@ -0,0 +1,359 @@
# Deploying Hadoop Clusters using Ansible

## Preface

The playbooks in this example are designed to deploy a Hadoop cluster on a
CentOS 6 or RHEL 6 environment using Ansible. The playbooks can:

1) Deploy a fully functional Hadoop cluster with HA and automatic failover.

2) Deploy a fully functional Hadoop cluster with no HA.

3) Deploy additional nodes to scale the cluster.

These playbooks require Ansible 1.2, CentOS 6 or RHEL 6 target machines, and install
the open-source Cloudera Hadoop Distribution (CDH) version 4.

## Hadoop Components

Hadoop is a framework that allows the processing of large datasets across large
clusters. The two main components that make up a Hadoop cluster are the HDFS
filesystem and the MapReduce framework. Briefly, the HDFS filesystem is responsible
for storing data across the cluster nodes on their local disks, while MapReduce
jobs are the tasks that run on these nodes to produce a meaningful result
from the data stored on the HDFS filesystem.

Let's have a closer look at each of these components.

## HDFS

![Alt text](/images/hdfs.png "HDFS")

The above diagram illustrates an HDFS filesystem. The cluster consists of three
DataNodes, which are responsible for storing/replicating data, while the NameNode
is a process which is responsible for storing the metadata for the entire
filesystem. As the example above illustrates, when a client wants to write a
file to the HDFS cluster, it first contacts the NameNode and lets it know that
it wants to write a file. The NameNode then decides where and how the file
should be saved and notifies the client of its decision.

In the given example, "File1" has a size of 128 MB and the block size of the HDFS
filesystem is 64 MB. Hence, the NameNode instructs the client to break down the
file into two different blocks and write the first block to DataNode 1 and the
second block to DataNode 2. Upon receiving the notification from the NameNode,
the client contacts DataNode 1 and DataNode 2 directly to write the data.

Once the data is received by the DataNodes, they replicate the block across the
other nodes. The number of nodes across which the data is replicated is
based on the HDFS configuration, the default value being 3. Meanwhile the
NameNode updates its metadata with the entry of the new file "File1" and the
locations where the parts are stored.
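
The block-splitting arithmetic above can be sketched in a few lines of Python. This is an illustrative sketch, not part of the playbooks; the 64 MB block size and 128 MB file size come from the example:

```python
# Illustrative sketch of HDFS block splitting; the 64 MB block size
# matches the example above, not necessarily your cluster's config.
BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB

def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the size in bytes of each block the file occupies."""
    blocks = []
    while file_size > 0:
        blocks.append(min(block_size, file_size))
        file_size -= block_size
    return blocks

# "File1" is 128 MB, so the NameNode has the client write two 64 MB blocks.
print(len(split_into_blocks(128 * 1024 * 1024)))  # 2
```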

## MapReduce

MapReduce is a Java application that utilizes the data stored in the
HDFS filesystem to produce some useful and meaningful result. The whole job is
split into two parts: the "Map" job and the "Reduce" job.

Let's consider an example. In the previous step we uploaded "File1"
into the HDFS filesystem and the file was broken down into two different
blocks. Let's assume that the first block had the data "black sheep" in it and
the second block had the data "white sheep" in it. Now let's assume a client
wants to get a count of all the words occurring in "File1". In order to get the
count, the first thing the client would have to do is write a "Map" program and
then a "Reduce" program.

Here's pseudocode of how the Map and Reduce jobs might look:

	mapper (File1, file-contents):
		for each word in file-contents:
			emit (word, 1)

	reducer (word, values):
		sum = 0
		for each value in values:
			sum = sum + value
		emit (word, sum)

The work of the Map job is to go through all the words in the file and emit
a key/value pair. In this case the key is the word itself and the value is always
1.

The Reduce job is quite simple: for each key, it adds up all the values it
receives and emits the total.
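
The pseudocode above translates almost line-for-line into runnable Python. This is an illustrative stand-in for the Java MapReduce job, using the block contents from the example ("black sheep" and "white sheep"):

```python
from collections import defaultdict

def mapper(file_contents):
    # Emit (word, 1) for every word, exactly as the Map pseudocode does.
    return [(word, 1) for word in file_contents.split()]

def reducer(pairs):
    # Sum the values emitted for each word.
    counts = defaultdict(int)
    for word, value in pairs:
        counts[word] += value
    return dict(counts)

# One mapper call per block, as in the example.
pairs = mapper("black sheep") + mapper("white sheep")
print(reducer(pairs))  # {'black': 1, 'sheep': 2, 'white': 1}
```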

Once the Map and Reduce jobs are ready, the client instructs the
"JobTracker" (the process responsible for scheduling the jobs on the cluster)
to run the MapReduce job on "File1".

Let's have a closer look at the anatomy of a Map job.

![Alt text](/images/map.png "Map job")

As the figure above shows, when the client instructs the JobTracker to run a
job on File1, the JobTracker first contacts the NameNode to determine where the
blocks of File1 are. Then the JobTracker sends the Map job's JAR file down
to the nodes having the blocks, and the TaskTracker processes on those nodes run
the application.

In the above example, DataNode 1 and DataNode 2 have the blocks, so the
TaskTrackers on those nodes run the Map jobs. Once the jobs are completed the
two nodes would have key/value results as below:

	MapJob Results:

	TaskTracker1:
	"Black: 1"
	"Sheep: 1"

	TaskTracker2:
	"White: 1"
	"Sheep: 1"

Once the Map phase is completed, the JobTracker process initiates the Shuffle
and Reduce process.

Let's have a closer look at the Shuffle-Reduce job.

![Alt text](/images/reduce.png "Reduce job")

As the figure above demonstrates, the first thing that the JobTracker does is
spawn a Reducer job on the DataNode/TaskTracker nodes for each "key" in the job
result. In this case we have three keys: "black", "white", and "sheep", so
three Reducers are spawned: one for each key. The Map jobs shuffle the keys
out to the respective Reduce jobs. Then the Reduce job code runs and the sum is
calculated, and the result is written into the HDFS filesystem in a common
directory. In the above example the output directory is specified as
"/home/ben/output", so all the Reducers will write their results into this
directory under different filenames, the file names being "part-00xx", where xx
is the Reducer/partition number.
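
The routing of keys to "part-00xx" files can be illustrated with a toy partitioner. The modulo-over-character-codes scheme below is an assumption for illustration only; real Hadoop uses a hash partitioner on the key, and the function name here is invented:

```python
# Toy partitioner: route each key to one of num_reducers reducers.
# Hadoop's default partitioner hashes the key mod the reducer count;
# this sketch mimics that idea with a simple character sum.
def partition(key, num_reducers):
    return sum(ord(c) for c in key) % num_reducers

keys = ["black", "white", "sheep"]
for key in keys:
    r = partition(key, 3)
    # Each reducer writes its result to a file named part-000<r>.
    print(f"{key} -> part-{r:05d}")
```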

## Hadoop Deployment

![Alt text](/images/hadoop.png "Hadoop deployment")

The above diagram depicts a typical Hadoop deployment. The NameNode and
JobTracker usually reside on the same machine, though they can run on separate
machines. The DataNodes and TaskTrackers run on the same nodes. The size of the
cluster can be scaled to thousands of nodes with petabytes of storage.

The above deployment model provides redundancy for data, as the HDFS filesystem
takes care of the data replication. The only single points of failure are the
NameNode and the JobTracker. If either of these components fails, the cluster
will not be usable.

## Making Hadoop HA

To make the Hadoop cluster highly available, we would have to add another set of
JobTracker/NameNodes, and make sure that any data updated by the master is also
updated on the secondary. In case of failure of the primary node, the
secondary node takes over that role.

The first thing that has to be dealt with is the data held by the NameNode. As
we recall, the NameNode holds all of the metadata about the filesystem, so any
update to the metadata should also be reflected on the secondary NameNode's
metadata copy. The synchronization of the primary and secondary NameNode
metadata is handled by the Quorum Journal Manager.

### Quorum Journal Manager

![Alt text](/images/qjm.png "QJM")

As the figure above shows, the Quorum Journal Manager consists of the journal
manager clients and journal manager nodes. The journal manager clients reside
on the same nodes as the NameNodes; the client on the primary node collects all the
edit logs happening on the NameNode and sends them out to the journal nodes. The
journal manager client residing on the secondary NameNode regularly contacts
the journal nodes and updates its local metadata to be consistent with the
master node. In case of primary node failure, the secondary NameNode updates
itself to the latest edit logs and takes over as the primary NameNode.
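
The quorum write that keeps the NameNodes in sync can be modeled in a few lines. This is a toy sketch of the idea (an edit counts as durable once a majority of journal nodes acknowledge it), not CDH4's actual JournalNode protocol; the class and function names are invented for illustration:

```python
# Toy model of quorum journaling: an edit is durable once a majority of
# journal nodes have stored it. Illustrative only; not the real protocol.
class JournalNode:
    def __init__(self):
        self.edits = []

    def store(self, edit):
        self.edits.append(edit)
        return True  # a real node could time out or fail here

def quorum_write(nodes, edit):
    acks = sum(1 for node in nodes if node.store(edit))
    return acks >= len(nodes) // 2 + 1  # majority, e.g. 2 of 3

nodes = [JournalNode() for _ in range(3)]
print(quorum_write(nodes, "mkdir /user/ben"))  # True: all 3 nodes acked
```

The standby NameNode can then replay the stored edits from any up-to-date majority of journal nodes before taking over.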

### Zookeeper

Apart from data consistency, a distributed cluster system also needs a
mechanism for centralized coordination. For example, there should be a way for
the secondary node to tell if the primary node is running properly, and if not,
it has to take up the role of the primary. Zookeeper provides Hadoop with a
mechanism to coordinate in this way.

![Alt text](/images/zookeeper.png "Zookeeper")

As the figure above shows, the Zookeeper service is a client/server based
service. The server component itself is replicated over a set of machines that
comprise the service. In short, high availability is built into the Zookeeper
servers.

For Hadoop, two Zookeeper clients have been built: one ZKFC (Zookeeper Failover
Controller) for the NameNode and one for the JobTracker. These clients run on
the same machines as the NameNodes/JobTrackers themselves.

When a ZKFC client is started, it establishes a connection with one of the
Zookeeper nodes and obtains a session ID. The client then keeps a health check
on the NameNode/JobTracker and sends heartbeats to ZooKeeper.

If the ZKFC client detects a failure of the NameNode/JobTracker, it removes
itself from the ZooKeeper active/standby election, and the other ZKFC client
fences the node/service and takes over the primary role.

## Hadoop HA Deployment

![Alt text](/images/hadoopha.png "Hadoop_HA")

The above diagram depicts a fully HA Hadoop cluster with no single point of
failure and automated failover.

## Deploying Hadoop Clusters with Ansible

Setting up a Hadoop cluster without HA is itself a challenging and
time-consuming task, and with HA, things become even more difficult.

Ansible can automate the whole process of deploying a Hadoop cluster, with or
without HA, using the same playbook, in a matter of minutes. This can be used for
quick environment rebuilds, and in case of disaster or node failures, recovery
time can be greatly reduced with Ansible automation.

Let's have a look to see how this is done.

## Deploying a Hadoop cluster with HA

### Prerequisites

These playbooks have been tested using Ansible v1.2 and CentOS 6.x (64 bit).

Modify group_vars/all to choose the network interface for Hadoop communication.

Optionally, you can change Hadoop-specific parameters like ports or directories
by editing the group_vars/all file.

Before launching the deployment playbook, make sure the inventory file (hosts)
is set up properly. Here's a sample:

	[hadoop_master_primary]
	zhadoop1

	[hadoop_master_secondary]
	zhadoop2

	[hadoop_masters:children]
	hadoop_master_primary
	hadoop_master_secondary

	[hadoop_slaves]
	hadoop1
	hadoop2
	hadoop3

	[qjournal_servers]
	zhadoop1
	zhadoop2
	zhadoop3

	[zookeeper_servers]
	zhadoop1 zoo_id=1
	zhadoop2 zoo_id=2
	zhadoop3 zoo_id=3

Once the inventory is set up, the Hadoop cluster can be set up using the following
command:

	ansible-playbook -i hosts site.yml

Once deployed, we can check the cluster sanity in different ways. To check the
status of the HDFS filesystem and get a report on all the DataNodes, log in as the
'hdfs' user on any Hadoop master server, and issue the following command to get
the report:

	hadoop dfsadmin -report

To check the sanity of HA, first log in as the 'hdfs' user on any Hadoop master
server and get the current active/standby NameNode servers this way:

	-bash-4.1$ hdfs haadmin -getServiceState zhadoop1
	active
	-bash-4.1$ hdfs haadmin -getServiceState zhadoop2
	standby

To get the state of the JobTracker process, log in as the 'mapred' user on any
Hadoop master server and issue the following command:

	-bash-4.1$ hadoop mrhaadmin -getServiceState hadoop1
	standby
	-bash-4.1$ hadoop mrhaadmin -getServiceState hadoop2
	active

Once you have determined which server is active and which is standby, you can
kill the NameNode/JobTracker process on the server listed as active and issue
the same commands again, and you should see that the standby has been promoted
to the active state. Later, you can restart the killed process and see that node
listed as standby.

### Running a MapReduce Job

To run a sample MapReduce job, execute the following script from any of the Hadoop
master nodes as the user 'hdfs', e.g. su - hdfs -c "/tmp/job.sh". The job counts
the number of occurrences of the word 'hello' in the given input file.

	#!/bin/bash
	cat > /tmp/inputfile << EOF
	hello
	sf
	sdf
	hello
	sdf
	sdf
	EOF
	hadoop fs -put /tmp/inputfile /inputfile
	hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar grep /inputfile /outputfile 'hello'
	hadoop fs -get /outputfile /tmp/outputfile/

To verify the result, read the file on the server located at
/tmp/outputfile/part-00000, which should give you the count.

## Scale the Cluster

When the Hadoop cluster reaches its maximum capacity, it can be scaled by
adding nodes. This can be easily accomplished by adding the node hostname to
the Ansible inventory under the hadoop_slaves group, and running the following
command:

	ansible-playbook -i hosts site.yml --tags=slaves

## Deploy a non-HA Hadoop Cluster

The following diagram illustrates a standalone Hadoop cluster.

To deploy this cluster, fill in the inventory file as follows:

	[hadoop_master_primary]
	zhadoop1

	[hadoop_master_secondary]

	[hadoop_masters:children]
	hadoop_master_primary
	hadoop_master_secondary

	[hadoop_slaves]
	hadoop1
	hadoop2
	hadoop3

Edit the group_vars/all file to disable HA:

	ha_enabled: False

And run the following command:

	ansible-playbook -i hosts site.yml

The validity of the cluster can be checked by running the same MapReduce job
that was documented above for the HA Hadoop cluster.
@ -0,0 +1,56 @@
# Defaults to the first ethernet interface. Change this to:
#
# iface: eth1
#
# ...to override.
#
iface: '{{ ansible_default_ipv4.interface }}'

ha_enabled: False

hadoop:

  # Variables for core-site.xml - common

  fs_default_FS_port: 8020
  nameservice_id: mycluster4

  # Variables for hdfs-site.xml

  dfs_permissions_superusergroup: hdfs
  dfs_namenode_name_dir:
    - /namedir1/
    - /namedir2/
  dfs_replication: 3
  dfs_namenode_handler_count: 50
  dfs_blocksize: 67108864
  dfs_datanode_data_dir:
    - /datadir1/
    - /datadir2/
  dfs_datanode_address_port: 50010
  dfs_datanode_http_address_port: 50075
  dfs_datanode_ipc_address_port: 50020
  dfs_namenode_http_address_port: 50070
  dfs_ha_zkfc_port: 8019
  qjournal_port: 8485
  qjournal_http_port: 8480
  dfs_journalnode_edits_dir: /journaldir/
  zookeeper_clientport: 2181
  zookeeper_leader_port: 2888
  zookeeper_election_port: 3888

  # Variables for mapred-site.xml - common
  mapred_job_tracker_ha_servicename: myjt4
  mapred_job_tracker_http_address_port: 50030
  mapred_task_tracker_http_address_port: 50060
  mapred_job_tracker_port: 8021
  mapred_ha_jobtracker_rpc-address_port: 8023
  mapred_ha_zkfc_port: 8018
  mapred_job_tracker_persist_jobstatus_dir: /jobdir/
  mapred_local_dir:
    - /mapred1/
    - /mapred2/
@ -0,0 +1,31 @@
[hadoop_all:children]
hadoop_masters
hadoop_slaves
qjournal_servers
zookeeper_servers

[hadoop_master_primary]
hadoop1

[hadoop_master_secondary]
hadoop2

[hadoop_masters:children]
hadoop_master_primary
hadoop_master_secondary

[hadoop_slaves]
hadoop1
hadoop2
hadoop3

[qjournal_servers]
hadoop1
hadoop2
hadoop3

[zookeeper_servers]
hadoop1 zoo_id=1
hadoop2 zoo_id=2
hadoop3 zoo_id=3
@ -0,0 +1,19 @@
asdf
sdf
sdf
sd
f
sf
sdf
sd
fsd
hello
asf
sf
sd
fsd
f
sdf
sd
hello
@ -0,0 +1,21 @@
---
# Launch a job to count the occurrences of a word.

- hosts: $server
  user: root
  tasks:
    - name: copy the file
      copy: src=inputfile dest=/tmp/inputfile

    - name: upload the file
      shell: su - hdfs -c "hadoop fs -put /tmp/inputfile /inputfile"

    - name: Run the MapReduce job to count the occurrences of the word hello
      shell: su - hdfs -c "hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar grep /inputfile /outputfile 'hello'"

    - name: Fetch the outputfile to the local tmp dir
      shell: su - hdfs -c "hadoop fs -get /outputfile /tmp/outputfile"

    - name: Get the outputfile to the ansible server
      fetch: dest=/tmp src=/tmp/outputfile/part-00000
@ -0,0 +1,5 @@ |
||||
[cloudera-cdh4] |
||||
name=Cloudera's Distribution for Hadoop, Version 4 |
||||
baseurl=http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/4/ |
||||
gpgkey = http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera |
||||
gpgcheck = 1 |
@ -0,0 +1,2 @@
- name: restart iptables
  service: name=iptables state=restarted
@ -0,0 +1,28 @@
---
# The playbook for common tasks

- name: Deploy the Cloudera Repository
  copy: src=etc/cloudera-CDH4.repo dest=/etc/yum.repos.d/cloudera-CDH4.repo

- name: Install the openjdk package
  yum: name=java-1.6.0-openjdk state=installed

- name: Create a directory for java
  file: state=directory path=/usr/java/

- name: Create a link for java
  file: src=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre state=link path=/usr/java/default

- name: Create the hosts file for all machines
  template: src=etc/hosts.j2 dest=/etc/hosts

- name: Disable SELinux in conf file
  lineinfile: dest=/etc/sysconfig/selinux regexp='^SELINUX=' line='SELINUX=disabled' state=present

- name: Disable SELinux dynamically
  shell: creates=/etc/sysconfig/selinux.disabled setenforce 0 ; touch /etc/sysconfig/selinux.disabled

- name: Create the iptables file for all machines
  template: src=iptables.j2 dest=/etc/sysconfig/iptables
  notify: restart iptables
@ -0,0 +1,5 @@ |
||||
--- |
||||
# The playbook for common tasks |
||||
|
||||
- include: common.yml tags=slaves |
||||
|
@ -0,0 +1,5 @@ |
||||
127.0.0.1 localhost |
||||
{% for host in groups.all %} |
||||
{{ hostvars[host]['ansible_' + iface].ipv4.address }} {{ host }} |
||||
{% endfor %} |
||||
|
@ -0,0 +1,25 @@ |
||||
<?xml version="1.0"?> |
||||
<!-- |
||||
Licensed to the Apache Software Foundation (ASF) under one or more |
||||
contributor license agreements. See the NOTICE file distributed with |
||||
this work for additional information regarding copyright ownership. |
||||
The ASF licenses this file to You under the Apache License, Version 2.0 |
||||
(the "License"); you may not use this file except in compliance with |
||||
the License. You may obtain a copy of the License at |
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0 |
||||
|
||||
Unless required by applicable law or agreed to in writing, software |
||||
distributed under the License is distributed on an "AS IS" BASIS, |
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
||||
See the License for the specific language governing permissions and |
||||
limitations under the License. |
||||
--> |
||||
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?> |
||||
|
||||
<configuration> |
||||
<property> |
||||
<name>fs.defaultFS</name> |
||||
<value>hdfs://{{ hostvars[groups['hadoop_masters'][0]]['ansible_hostname'] + ':' ~ hadoop['fs_default_FS_port'] }}/</value> |
||||
</property> |
||||
</configuration> |
@ -0,0 +1,75 @@ |
||||
# Configuration of the "dfs" context for null |
||||
dfs.class=org.apache.hadoop.metrics.spi.NullContext |
||||
|
||||
# Configuration of the "dfs" context for file |
||||
#dfs.class=org.apache.hadoop.metrics.file.FileContext |
||||
#dfs.period=10 |
||||
#dfs.fileName=/tmp/dfsmetrics.log |
||||
|
||||
# Configuration of the "dfs" context for ganglia |
||||
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter) |
||||
# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext |
||||
# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 |
||||
# dfs.period=10 |
||||
# dfs.servers=localhost:8649 |
||||
|
||||
|
||||
# Configuration of the "mapred" context for null |
||||
mapred.class=org.apache.hadoop.metrics.spi.NullContext |
||||
|
||||
# Configuration of the "mapred" context for file |
||||
#mapred.class=org.apache.hadoop.metrics.file.FileContext |
||||
#mapred.period=10 |
||||
#mapred.fileName=/tmp/mrmetrics.log |
||||
|
||||
# Configuration of the "mapred" context for ganglia |
||||
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter) |
||||
# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext |
||||
# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 |
||||
# mapred.period=10 |
||||
# mapred.servers=localhost:8649 |
||||
|
||||
|
||||
# Configuration of the "jvm" context for null |
||||
#jvm.class=org.apache.hadoop.metrics.spi.NullContext |
||||
|
||||
# Configuration of the "jvm" context for file |
||||
#jvm.class=org.apache.hadoop.metrics.file.FileContext |
||||
#jvm.period=10 |
||||
#jvm.fileName=/tmp/jvmmetrics.log |
||||
|
||||
# Configuration of the "jvm" context for ganglia |
||||
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext |
||||
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 |
||||
# jvm.period=10 |
||||
# jvm.servers=localhost:8649 |
||||
|
||||
# Configuration of the "rpc" context for null |
||||
rpc.class=org.apache.hadoop.metrics.spi.NullContext |
||||
|
||||
# Configuration of the "rpc" context for file |
||||
#rpc.class=org.apache.hadoop.metrics.file.FileContext |
||||
#rpc.period=10 |
||||
#rpc.fileName=/tmp/rpcmetrics.log |
||||
|
||||
# Configuration of the "rpc" context for ganglia |
||||
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext |
||||
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 |
||||
# rpc.period=10 |
||||
# rpc.servers=localhost:8649 |
||||
|
||||
|
||||
# Configuration of the "ugi" context for null |
||||
ugi.class=org.apache.hadoop.metrics.spi.NullContext |
||||
|
||||
# Configuration of the "ugi" context for file |
||||
#ugi.class=org.apache.hadoop.metrics.file.FileContext |
||||
#ugi.period=10 |
||||
#ugi.fileName=/tmp/ugimetrics.log |
||||
|
||||
# Configuration of the "ugi" context for ganglia |
||||
# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext |
||||
# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 |
||||
# ugi.period=10 |
||||
# ugi.servers=localhost:8649 |
||||
|
@ -0,0 +1,44 @@ |
||||
# |
||||
# Licensed to the Apache Software Foundation (ASF) under one or more |
||||
# contributor license agreements. See the NOTICE file distributed with |
||||
# this work for additional information regarding copyright ownership. |
||||
# The ASF licenses this file to You under the Apache License, Version 2.0 |
||||
# (the "License"); you may not use this file except in compliance with |
||||
# the License. You may obtain a copy of the License at |
||||
# |
||||
# http://www.apache.org/licenses/LICENSE-2.0 |
||||
# |
||||
# Unless required by applicable law or agreed to in writing, software |
||||
# distributed under the License is distributed on an "AS IS" BASIS, |
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
||||
# See the License for the specific language governing permissions and |
||||
# limitations under the License. |
||||
# |
||||
|
||||
# syntax: [prefix].[source|sink].[instance].[options] |
||||
# See javadoc of package-info.java for org.apache.hadoop.metrics2 for details |
||||
|
||||
*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink |
||||
# default sampling period, in seconds |
||||
*.period=10 |
||||
|
||||
# The namenode-metrics.out will contain metrics from all context |
||||
#namenode.sink.file.filename=namenode-metrics.out |
||||
# Specifying a special sampling period for namenode: |
||||
#namenode.sink.*.period=8 |
||||
|
||||
#datanode.sink.file.filename=datanode-metrics.out |
||||
|
||||
# the following example split metrics of different |
||||
# context to different sinks (in this case files) |
||||
#jobtracker.sink.file_jvm.context=jvm |
||||
#jobtracker.sink.file_jvm.filename=jobtracker-jvm-metrics.out |
||||
#jobtracker.sink.file_mapred.context=mapred |
||||
#jobtracker.sink.file_mapred.filename=jobtracker-mapred-metrics.out |
||||
|
||||
#tasktracker.sink.file.filename=tasktracker-metrics.out |
||||
|
||||
#maptask.sink.file.filename=maptask-metrics.out |
||||
|
||||
#reducetask.sink.file.filename=reducetask-metrics.out |
||||
|
@ -0,0 +1,57 @@ |
||||
<?xml version="1.0"?> |
||||
<!-- |
||||
Licensed to the Apache Software Foundation (ASF) under one or more |
||||
contributor license agreements. See the NOTICE file distributed with |
||||
this work for additional information regarding copyright ownership. |
||||
The ASF licenses this file to You under the Apache License, Version 2.0 |
||||
(the "License"); you may not use this file except in compliance with |
||||
the License. You may obtain a copy of the License at |
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0 |
||||
|
||||
Unless required by applicable law or agreed to in writing, software |
||||
distributed under the License is distributed on an "AS IS" BASIS, |
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
||||
See the License for the specific language governing permissions and |
||||
limitations under the License. |
||||
--> |
||||
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?> |
||||
|
||||
<configuration> |
||||
<property> |
||||
<name>dfs.blocksize</name> |
||||
<value>{{ hadoop['dfs_blocksize'] }}</value> |
||||
</property> |
||||
<property> |
||||
<name>dfs.permissions.superusergroup</name> |
||||
<value>{{ hadoop['dfs_permissions_superusergroup'] }}</value> |
||||
</property> |
||||
<property> |
||||
<name>dfs.namenode.http.address</name> |
||||
<value>0.0.0.0:{{ hadoop['dfs_namenode_http_address_port'] }}</value> |
||||
</property> |
||||
<property> |
||||
<name>dfs.datanode.address</name> |
||||
<value>0.0.0.0:{{ hadoop['dfs_datanode_address_port'] }}</value> |
||||
</property> |
||||
<property> |
||||
<name>dfs.datanode.http.address</name> |
||||
<value>0.0.0.0:{{ hadoop['dfs_datanode_http_address_port'] }}</value> |
||||
</property> |
||||
<property> |
||||
<name>dfs.datanode.ipc.address</name> |
||||
<value>0.0.0.0:{{ hadoop['dfs_datanode_ipc_address_port'] }}</value> |
||||
</property> |
||||
<property> |
||||
<name>dfs.replication</name> |
||||
<value>{{ hadoop['dfs_replication'] }}</value> |
||||
</property> |
||||
<property> |
||||
<name>dfs.namenode.name.dir</name> |
||||
<value>{{ hadoop['dfs_namenode_name_dir'] | join(',') }}</value> |
||||
</property> |
||||
<property> |
||||
<name>dfs.datanode.data.dir</name> |
||||
<value>{{ hadoop['dfs_datanode_data_dir'] | join(',') }}</value> |
||||
</property> |
||||
</configuration> |
@ -0,0 +1,219 @@ |
||||
# Copyright 2011 The Apache Software Foundation |
||||
# |
||||
# Licensed to the Apache Software Foundation (ASF) under one |
||||
# or more contributor license agreements. See the NOTICE file |
||||
# distributed with this work for additional information |
||||
# regarding copyright ownership. The ASF licenses this file |
||||
# to you under the Apache License, Version 2.0 (the |
||||
# "License"); you may not use this file except in compliance |
||||
# with the License. You may obtain a copy of the License at |
||||
# |
||||
# http://www.apache.org/licenses/LICENSE-2.0 |
||||
# |
||||
# Unless required by applicable law or agreed to in writing, software |
||||
# distributed under the License is distributed on an "AS IS" BASIS, |
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
||||
# See the License for the specific language governing permissions and |
||||
# limitations under the License. |
||||
|
||||
# Define some default values that can be overridden by system properties |
||||
hadoop.root.logger=INFO,console |
||||
hadoop.log.dir=. |
||||
hadoop.log.file=hadoop.log |
||||
|
||||
# Define the root logger to the system property "hadoop.root.logger". |
||||
log4j.rootLogger=${hadoop.root.logger}, EventCounter |
||||
|
||||
# Logging Threshold |
||||
log4j.threshold=ALL |
||||
|
||||
# Null Appender |
||||
log4j.appender.NullAppender=org.apache.log4j.varia.NullAppender |
||||
|
||||
# |
||||
# Rolling File Appender - cap space usage at 5gb. |
||||
# |
||||
hadoop.log.maxfilesize=256MB |
||||
hadoop.log.maxbackupindex=20 |
||||
log4j.appender.RFA=org.apache.log4j.RollingFileAppender |
||||
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file} |
||||
|
||||
log4j.appender.RFA.MaxFileSize=${hadoop.log.maxfilesize} |
||||
log4j.appender.RFA.MaxBackupIndex=${hadoop.log.maxbackupindex} |
||||
|
||||
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout |
||||
|
||||
# Pattern format: Date LogLevel LoggerName LogMessage |
||||
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n |
||||
# Debugging Pattern format |
||||
#log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n |
||||
|
||||
|
||||
# |
||||
# Daily Rolling File Appender |
||||
# |
||||
|
||||
log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender |
||||
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file} |
||||
|
||||
# Rollver at midnight |
||||
log4j.appender.DRFA.DatePattern=.yyyy-MM-dd |
||||
|
||||
# 30-day backup |
||||
#log4j.appender.DRFA.MaxBackupIndex=30 |
||||
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout |
||||
|
||||
# Pattern format: Date LogLevel LoggerName LogMessage |
||||
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n |
||||
# Debugging Pattern format |
||||
#log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n |
||||
|
||||
|
||||
# |
||||
# console |
||||
# Add "console" to rootlogger above if you want to use this |
||||
# |
||||
|
||||
log4j.appender.console=org.apache.log4j.ConsoleAppender |
||||
log4j.appender.console.target=System.err |
||||
log4j.appender.console.layout=org.apache.log4j.PatternLayout |
||||
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n |
||||
|
||||
# |
||||
# TaskLog Appender |
||||
# |
||||
|
||||
#Default values |
||||
hadoop.tasklog.taskid=null |
||||
hadoop.tasklog.iscleanup=false |
||||
hadoop.tasklog.noKeepSplits=4 |
||||
hadoop.tasklog.totalLogFileSize=100 |
||||
hadoop.tasklog.purgeLogSplits=true |
||||
hadoop.tasklog.logsRetainHours=12 |
||||
|
||||
log4j.appender.TLA=org.apache.hadoop.mapred.TaskLogAppender |
||||
log4j.appender.TLA.taskId=${hadoop.tasklog.taskid} |
||||
log4j.appender.TLA.isCleanup=${hadoop.tasklog.iscleanup} |
||||
log4j.appender.TLA.totalLogFileSize=${hadoop.tasklog.totalLogFileSize} |
||||
|
||||
log4j.appender.TLA.layout=org.apache.log4j.PatternLayout |
||||
log4j.appender.TLA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n |
||||
|
||||
# |
||||
# HDFS block state change log from block manager |
||||
# |
||||
# Uncomment the following to suppress normal block state change |
||||
# messages from BlockManager in NameNode. |
||||
#log4j.logger.BlockStateChange=WARN |
||||
|
||||
# |
||||
#Security appender |
||||
# |
||||
hadoop.security.logger=INFO,NullAppender |
||||
hadoop.security.log.maxfilesize=256MB |
||||
hadoop.security.log.maxbackupindex=20 |
||||
log4j.category.SecurityLogger=${hadoop.security.logger} |
||||
hadoop.security.log.file=SecurityAuth-${user.name}.audit |
||||
log4j.appender.RFAS=org.apache.log4j.RollingFileAppender |
||||
log4j.appender.RFAS.File=${hadoop.log.dir}/${hadoop.security.log.file} |
||||
log4j.appender.RFAS.layout=org.apache.log4j.PatternLayout |
||||
log4j.appender.RFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n |
||||
log4j.appender.RFAS.MaxFileSize=${hadoop.security.log.maxfilesize} |
||||
log4j.appender.RFAS.MaxBackupIndex=${hadoop.security.log.maxbackupindex} |
||||
|
||||
# |
||||
# Daily Rolling Security appender |
||||
# |
||||
log4j.appender.DRFAS=org.apache.log4j.DailyRollingFileAppender |
||||
log4j.appender.DRFAS.File=${hadoop.log.dir}/${hadoop.security.log.file} |
||||
log4j.appender.DRFAS.layout=org.apache.log4j.PatternLayout |
||||
log4j.appender.DRFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n |
||||
log4j.appender.DRFAS.DatePattern=.yyyy-MM-dd |
||||
|
||||
# |
||||
# hdfs audit logging |
||||
# |
||||
hdfs.audit.logger=INFO,NullAppender |
||||
hdfs.audit.log.maxfilesize=256MB |
||||
hdfs.audit.log.maxbackupindex=20 |
||||
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger} |
||||
log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false |
||||
log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender |
||||
log4j.appender.RFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log |
||||
log4j.appender.RFAAUDIT.layout=org.apache.log4j.PatternLayout |
||||
log4j.appender.RFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n |
||||
log4j.appender.RFAAUDIT.MaxFileSize=${hdfs.audit.log.maxfilesize} |
||||
log4j.appender.RFAAUDIT.MaxBackupIndex=${hdfs.audit.log.maxbackupindex} |
||||
|
||||
# |
||||
# mapred audit logging |
||||
# |
||||
mapred.audit.logger=INFO,NullAppender |
||||
mapred.audit.log.maxfilesize=256MB |
||||
mapred.audit.log.maxbackupindex=20 |
||||
log4j.logger.org.apache.hadoop.mapred.AuditLogger=${mapred.audit.logger} |
||||
log4j.additivity.org.apache.hadoop.mapred.AuditLogger=false |
||||
log4j.appender.MRAUDIT=org.apache.log4j.RollingFileAppender |
||||
log4j.appender.MRAUDIT.File=${hadoop.log.dir}/mapred-audit.log |
||||
log4j.appender.MRAUDIT.layout=org.apache.log4j.PatternLayout |
||||
log4j.appender.MRAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n |
||||
log4j.appender.MRAUDIT.MaxFileSize=${mapred.audit.log.maxfilesize} |
||||
log4j.appender.MRAUDIT.MaxBackupIndex=${mapred.audit.log.maxbackupindex} |
||||
|
||||
# Custom Logging levels |
||||
|
||||
#log4j.logger.org.apache.hadoop.mapred.JobTracker=DEBUG |
||||
#log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG |
||||
#log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=DEBUG |
||||
|
||||
# Jets3t library |
||||
log4j.logger.org.jets3t.service.impl.rest.httpclient.RestS3Service=ERROR |
||||
|
||||
# |
||||
# Event Counter Appender |
||||
# Sends counts of logging messages at different severity levels to Hadoop Metrics. |
||||
# |
||||
log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter |
||||
|
||||
# |
||||
# Job Summary Appender |
||||
# |
||||
# Use following logger to send summary to separate file defined by |
||||
# hadoop.mapreduce.jobsummary.log.file : |
||||
# hadoop.mapreduce.jobsummary.logger=INFO,JSA |
||||
# |
||||
hadoop.mapreduce.jobsummary.logger=${hadoop.root.logger} |
||||
hadoop.mapreduce.jobsummary.log.file=hadoop-mapreduce.jobsummary.log |
||||
hadoop.mapreduce.jobsummary.log.maxfilesize=256MB |
||||
hadoop.mapreduce.jobsummary.log.maxbackupindex=20 |
||||
log4j.appender.JSA=org.apache.log4j.RollingFileAppender |
||||
log4j.appender.JSA.File=${hadoop.log.dir}/${hadoop.mapreduce.jobsummary.log.file} |
||||
log4j.appender.JSA.MaxFileSize=${hadoop.mapreduce.jobsummary.log.maxfilesize} |
||||
log4j.appender.JSA.MaxBackupIndex=${hadoop.mapreduce.jobsummary.log.maxbackupindex} |
||||
log4j.appender.JSA.layout=org.apache.log4j.PatternLayout |
||||
log4j.appender.JSA.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n |
||||
log4j.logger.org.apache.hadoop.mapred.JobInProgress$JobSummary=${hadoop.mapreduce.jobsummary.logger} |
||||
log4j.additivity.org.apache.hadoop.mapred.JobInProgress$JobSummary=false |
||||
|
||||
# |
||||
# Yarn ResourceManager Application Summary Log |
||||
# |
||||
# Set the ResourceManager summary log filename |
||||
#yarn.server.resourcemanager.appsummary.log.file=rm-appsummary.log |
||||
# Set the ResourceManager summary log level and appender |
||||
#yarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY |
||||
|
||||
# Appender for ResourceManager Application Summary Log |
||||
# Requires the following properties to be set |
||||
# - hadoop.log.dir (Hadoop Log directory) |
||||
# - yarn.server.resourcemanager.appsummary.log.file (resource manager app summary log filename) |
||||
# - yarn.server.resourcemanager.appsummary.logger (resource manager app summary log level and appender) |
||||
|
||||
#log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger} |
||||
#log4j.additivity.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=false |
||||
#log4j.appender.RMSUMMARY=org.apache.log4j.RollingFileAppender |
||||
#log4j.appender.RMSUMMARY.File=${hadoop.log.dir}/${yarn.server.resourcemanager.appsummary.log.file} |
||||
#log4j.appender.RMSUMMARY.MaxFileSize=256MB |
||||
#log4j.appender.RMSUMMARY.MaxBackupIndex=20 |
||||
#log4j.appender.RMSUMMARY.layout=org.apache.log4j.PatternLayout |
||||
#log4j.appender.RMSUMMARY.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n |
@ -0,0 +1,22 @@
<configuration>

  <property>
    <name>mapred.job.tracker</name>
    <value>{{ hostvars[groups['hadoop_masters'][0]]['ansible_hostname'] }}:{{ hadoop['mapred_job_tracker_port'] }}</value>
  </property>

  <property>
    <name>mapred.local.dir</name>
    <value>{{ hadoop["mapred_local_dir"] | join(',') }}</value>
  </property>

  <property>
    <name>mapred.task.tracker.http.address</name>
    <value>0.0.0.0:{{ hadoop['mapred_task_tracker_http_address_port'] }}</value>
  </property>
  <property>
    <name>mapred.job.tracker.http.address</name>
    <value>0.0.0.0:{{ hadoop['mapred_job_tracker_http_address_port'] }}</value>
  </property>

</configuration>
@ -0,0 +1,3 @@
{% for host in groups['hadoop_slaves'] %}
{{ host }}
{% endfor %}
@ -0,0 +1,80 @@
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements. See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License. You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<configuration>

  <property>
    <name>ssl.client.truststore.location</name>
    <value></value>
    <description>Truststore to be used by clients like distcp. Must be
    specified.
    </description>
  </property>

  <property>
    <name>ssl.client.truststore.password</name>
    <value></value>
    <description>Optional. Default value is "".
    </description>
  </property>

  <property>
    <name>ssl.client.truststore.type</name>
    <value>jks</value>
    <description>Optional. The keystore file format, default value is "jks".
    </description>
  </property>

  <property>
    <name>ssl.client.truststore.reload.interval</name>
    <value>10000</value>
    <description>Truststore reload check interval, in milliseconds.
    Default value is 10000 (10 seconds).
    </description>
  </property>

  <property>
    <name>ssl.client.keystore.location</name>
    <value></value>
    <description>Keystore to be used by clients like distcp. Must be
    specified.
    </description>
  </property>

  <property>
    <name>ssl.client.keystore.password</name>
    <value></value>
    <description>Optional. Default value is "".
    </description>
  </property>

  <property>
    <name>ssl.client.keystore.keypassword</name>
    <value></value>
    <description>Optional. Default value is "".
    </description>
  </property>

  <property>
    <name>ssl.client.keystore.type</name>
    <value>jks</value>
    <description>Optional. The keystore file format, default value is "jks".
    </description>
  </property>

</configuration>
@ -0,0 +1,77 @@
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements. See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License. You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<configuration>

  <property>
    <name>ssl.server.truststore.location</name>
    <value></value>
    <description>Truststore to be used by NN and DN. Must be specified.
    </description>
  </property>

  <property>
    <name>ssl.server.truststore.password</name>
    <value></value>
    <description>Optional. Default value is "".
    </description>
  </property>

  <property>
    <name>ssl.server.truststore.type</name>
    <value>jks</value>
    <description>Optional. The keystore file format, default value is "jks".
    </description>
  </property>

  <property>
    <name>ssl.server.truststore.reload.interval</name>
    <value>10000</value>
    <description>Truststore reload check interval, in milliseconds.
    Default value is 10000 (10 seconds).
    </description>
  </property>

  <property>
    <name>ssl.server.keystore.location</name>
    <value></value>
    <description>Keystore to be used by NN and DN. Must be specified.
    </description>
  </property>

  <property>
    <name>ssl.server.keystore.password</name>
    <value></value>
    <description>Must be specified.
    </description>
  </property>

  <property>
    <name>ssl.server.keystore.keypassword</name>
    <value></value>
    <description>Must be specified.
    </description>
  </property>

  <property>
    <name>ssl.server.keystore.type</name>
    <value>jks</value>
    <description>Optional. The keystore file format, default value is "jks".
    </description>
  </property>

</configuration>
@ -0,0 +1,25 @@
<?xml version="1.0"?>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements. See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License. You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://{{ hadoop['nameservice_id'] }}/</value>
  </property>
</configuration>
@ -0,0 +1,75 @@
# Configuration of the "dfs" context for null
dfs.class=org.apache.hadoop.metrics.spi.NullContext

# Configuration of the "dfs" context for file
#dfs.class=org.apache.hadoop.metrics.file.FileContext
#dfs.period=10
#dfs.fileName=/tmp/dfsmetrics.log

# Configuration of the "dfs" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# dfs.period=10
# dfs.servers=localhost:8649


# Configuration of the "mapred" context for null
mapred.class=org.apache.hadoop.metrics.spi.NullContext

# Configuration of the "mapred" context for file
#mapred.class=org.apache.hadoop.metrics.file.FileContext
#mapred.period=10
#mapred.fileName=/tmp/mrmetrics.log

# Configuration of the "mapred" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# mapred.period=10
# mapred.servers=localhost:8649


# Configuration of the "jvm" context for null
#jvm.class=org.apache.hadoop.metrics.spi.NullContext

# Configuration of the "jvm" context for file
#jvm.class=org.apache.hadoop.metrics.file.FileContext
#jvm.period=10
#jvm.fileName=/tmp/jvmmetrics.log

# Configuration of the "jvm" context for ganglia
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# jvm.period=10
# jvm.servers=localhost:8649

# Configuration of the "rpc" context for null
rpc.class=org.apache.hadoop.metrics.spi.NullContext

# Configuration of the "rpc" context for file
#rpc.class=org.apache.hadoop.metrics.file.FileContext
#rpc.period=10
#rpc.fileName=/tmp/rpcmetrics.log

# Configuration of the "rpc" context for ganglia
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# rpc.period=10
# rpc.servers=localhost:8649


# Configuration of the "ugi" context for null
ugi.class=org.apache.hadoop.metrics.spi.NullContext

# Configuration of the "ugi" context for file
#ugi.class=org.apache.hadoop.metrics.file.FileContext
#ugi.period=10
#ugi.fileName=/tmp/ugimetrics.log

# Configuration of the "ugi" context for ganglia
# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# ugi.period=10
# ugi.servers=localhost:8649

@ -0,0 +1,44 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# syntax: [prefix].[source|sink].[instance].[options]
# See javadoc of package-info.java for org.apache.hadoop.metrics2 for details

*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
# default sampling period, in seconds
*.period=10

# The namenode-metrics.out will contain metrics from all contexts
#namenode.sink.file.filename=namenode-metrics.out
# Specifying a special sampling period for namenode:
#namenode.sink.*.period=8

#datanode.sink.file.filename=datanode-metrics.out

# The following example splits metrics of different
# contexts to different sinks (in this case files)
#jobtracker.sink.file_jvm.context=jvm
#jobtracker.sink.file_jvm.filename=jobtracker-jvm-metrics.out
#jobtracker.sink.file_mapred.context=mapred
#jobtracker.sink.file_mapred.filename=jobtracker-mapred-metrics.out

#tasktracker.sink.file.filename=tasktracker-metrics.out

#maptask.sink.file.filename=maptask-metrics.out

#reducetask.sink.file.filename=reducetask-metrics.out

@ -0,0 +1,103 @@
<?xml version="1.0"?>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements. See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License. You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>{{ hadoop['nameservice_id'] }}</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.{{ hadoop['nameservice_id'] }}</name>
    <value>{{ groups.hadoop_masters | join(',') }}</value>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>{{ hadoop['dfs_blocksize'] }}</value>
  </property>
  <property>
    <name>dfs.permissions.superusergroup</name>
    <value>{{ hadoop['dfs_permissions_superusergroup'] }}</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>{{ groups.zookeeper_servers | join(':' ~ hadoop['zookeeper_clientport'] ~ ',') }}:{{ hadoop['zookeeper_clientport'] }}</value>
  </property>

{% for host in groups['hadoop_masters'] %}
  <property>
    <name>dfs.namenode.rpc-address.{{ hadoop['nameservice_id'] }}.{{ host }}</name>
    <value>{{ host }}:{{ hadoop['fs_default_FS_port'] }}</value>
  </property>
{% endfor %}
{% for host in groups['hadoop_masters'] %}
  <property>
    <name>dfs.namenode.http-address.{{ hadoop['nameservice_id'] }}.{{ host }}</name>
    <value>{{ host }}:{{ hadoop['dfs_namenode_http_address_port'] }}</value>
  </property>
{% endfor %}
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://{{ groups.qjournal_servers | join(':' ~ hadoop['qjournal_port'] ~ ';') }}:{{ hadoop['qjournal_port'] }}/{{ hadoop['nameservice_id'] }}</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>{{ hadoop['dfs_journalnode_edits_dir'] }}</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.{{ hadoop['nameservice_id'] }}</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>shell(/bin/true)</value>
  </property>

  <property>
    <name>dfs.ha.zkfc.port</name>
    <value>{{ hadoop['dfs_ha_zkfc_port'] }}</value>
  </property>

  <property>
    <name>dfs.datanode.address</name>
    <value>0.0.0.0:{{ hadoop['dfs_datanode_address_port'] }}</value>
  </property>
  <property>
    <name>dfs.datanode.http.address</name>
    <value>0.0.0.0:{{ hadoop['dfs_datanode_http_address_port'] }}</value>
  </property>
  <property>
    <name>dfs.datanode.ipc.address</name>
    <value>0.0.0.0:{{ hadoop['dfs_datanode_ipc_address_port'] }}</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>{{ hadoop['dfs_replication'] }}</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>{{ hadoop['dfs_namenode_name_dir'] | join(',') }}</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>{{ hadoop['dfs_datanode_data_dir'] | join(',') }}</value>
  </property>
</configuration>
@ -0,0 +1,219 @@
# Copyright 2011 The Apache Software Foundation
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Define some default values that can be overridden by system properties
hadoop.root.logger=INFO,console
hadoop.log.dir=.
hadoop.log.file=hadoop.log

# Define the root logger to the system property "hadoop.root.logger".
log4j.rootLogger=${hadoop.root.logger}, EventCounter

# Logging Threshold
log4j.threshold=ALL

# Null Appender
log4j.appender.NullAppender=org.apache.log4j.varia.NullAppender

#
# Rolling File Appender - cap space usage at 5gb.
#
hadoop.log.maxfilesize=256MB
hadoop.log.maxbackupindex=20
log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}

log4j.appender.RFA.MaxFileSize=${hadoop.log.maxfilesize}
log4j.appender.RFA.MaxBackupIndex=${hadoop.log.maxbackupindex}

log4j.appender.RFA.layout=org.apache.log4j.PatternLayout

# Pattern format: Date LogLevel LoggerName LogMessage
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
# Debugging Pattern format
#log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n


#
# Daily Rolling File Appender
#

log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}

# Rollover at midnight
log4j.appender.DRFA.DatePattern=.yyyy-MM-dd

# 30-day backup
#log4j.appender.DRFA.MaxBackupIndex=30
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout

# Pattern format: Date LogLevel LoggerName LogMessage
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
# Debugging Pattern format
#log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n


#
# console
# Add "console" to root logger above if you want to use this
#

log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n

#
# TaskLog Appender
#

# Default values
hadoop.tasklog.taskid=null
hadoop.tasklog.iscleanup=false
hadoop.tasklog.noKeepSplits=4
hadoop.tasklog.totalLogFileSize=100
hadoop.tasklog.purgeLogSplits=true
hadoop.tasklog.logsRetainHours=12

log4j.appender.TLA=org.apache.hadoop.mapred.TaskLogAppender
log4j.appender.TLA.taskId=${hadoop.tasklog.taskid}
log4j.appender.TLA.isCleanup=${hadoop.tasklog.iscleanup}
log4j.appender.TLA.totalLogFileSize=${hadoop.tasklog.totalLogFileSize}

log4j.appender.TLA.layout=org.apache.log4j.PatternLayout
log4j.appender.TLA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n

#
# HDFS block state change log from block manager
#
# Uncomment the following to suppress normal block state change
# messages from BlockManager in NameNode.
#log4j.logger.BlockStateChange=WARN

#
# Security appender
#
hadoop.security.logger=INFO,NullAppender
hadoop.security.log.maxfilesize=256MB
hadoop.security.log.maxbackupindex=20
log4j.category.SecurityLogger=${hadoop.security.logger}
hadoop.security.log.file=SecurityAuth-${user.name}.audit
log4j.appender.RFAS=org.apache.log4j.RollingFileAppender
log4j.appender.RFAS.File=${hadoop.log.dir}/${hadoop.security.log.file}
log4j.appender.RFAS.layout=org.apache.log4j.PatternLayout
log4j.appender.RFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
log4j.appender.RFAS.MaxFileSize=${hadoop.security.log.maxfilesize}
log4j.appender.RFAS.MaxBackupIndex=${hadoop.security.log.maxbackupindex}

#
# Daily Rolling Security appender
#
log4j.appender.DRFAS=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFAS.File=${hadoop.log.dir}/${hadoop.security.log.file}
log4j.appender.DRFAS.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
log4j.appender.DRFAS.DatePattern=.yyyy-MM-dd

#
# hdfs audit logging
#
hdfs.audit.logger=INFO,NullAppender
hdfs.audit.log.maxfilesize=256MB
hdfs.audit.log.maxbackupindex=20
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger}
log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false
log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.RFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log
log4j.appender.RFAAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.RFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.RFAAUDIT.MaxFileSize=${hdfs.audit.log.maxfilesize}
log4j.appender.RFAAUDIT.MaxBackupIndex=${hdfs.audit.log.maxbackupindex}

#
# mapred audit logging
#
mapred.audit.logger=INFO,NullAppender
mapred.audit.log.maxfilesize=256MB
mapred.audit.log.maxbackupindex=20
log4j.logger.org.apache.hadoop.mapred.AuditLogger=${mapred.audit.logger}
log4j.additivity.org.apache.hadoop.mapred.AuditLogger=false
log4j.appender.MRAUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.MRAUDIT.File=${hadoop.log.dir}/mapred-audit.log
log4j.appender.MRAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.MRAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.MRAUDIT.MaxFileSize=${mapred.audit.log.maxfilesize}
log4j.appender.MRAUDIT.MaxBackupIndex=${mapred.audit.log.maxbackupindex}

# Custom Logging levels

#log4j.logger.org.apache.hadoop.mapred.JobTracker=DEBUG
#log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG
#log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=DEBUG

# Jets3t library
log4j.logger.org.jets3t.service.impl.rest.httpclient.RestS3Service=ERROR

#
# Event Counter Appender
# Sends counts of logging messages at different severity levels to Hadoop Metrics.
#
log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter

#
# Job Summary Appender
#
# Use following logger to send summary to separate file defined by
# hadoop.mapreduce.jobsummary.log.file :
# hadoop.mapreduce.jobsummary.logger=INFO,JSA
#
hadoop.mapreduce.jobsummary.logger=${hadoop.root.logger}
hadoop.mapreduce.jobsummary.log.file=hadoop-mapreduce.jobsummary.log
hadoop.mapreduce.jobsummary.log.maxfilesize=256MB
hadoop.mapreduce.jobsummary.log.maxbackupindex=20
log4j.appender.JSA=org.apache.log4j.RollingFileAppender
log4j.appender.JSA.File=${hadoop.log.dir}/${hadoop.mapreduce.jobsummary.log.file}
log4j.appender.JSA.MaxFileSize=${hadoop.mapreduce.jobsummary.log.maxfilesize}
log4j.appender.JSA.MaxBackupIndex=${hadoop.mapreduce.jobsummary.log.maxbackupindex}
log4j.appender.JSA.layout=org.apache.log4j.PatternLayout
log4j.appender.JSA.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n
log4j.logger.org.apache.hadoop.mapred.JobInProgress$JobSummary=${hadoop.mapreduce.jobsummary.logger}
log4j.additivity.org.apache.hadoop.mapred.JobInProgress$JobSummary=false

#
# Yarn ResourceManager Application Summary Log
#
# Set the ResourceManager summary log filename
#yarn.server.resourcemanager.appsummary.log.file=rm-appsummary.log
# Set the ResourceManager summary log level and appender
#yarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY

# Appender for ResourceManager Application Summary Log
# Requires the following properties to be set
#    - hadoop.log.dir (Hadoop Log directory)
#    - yarn.server.resourcemanager.appsummary.log.file (resource manager app summary log filename)
#    - yarn.server.resourcemanager.appsummary.logger (resource manager app summary log level and appender)

#log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger}
||||
#log4j.additivity.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=false |
||||
#log4j.appender.RMSUMMARY=org.apache.log4j.RollingFileAppender |
||||
#log4j.appender.RMSUMMARY.File=${hadoop.log.dir}/${yarn.server.resourcemanager.appsummary.log.file} |
||||
#log4j.appender.RMSUMMARY.MaxFileSize=256MB |
||||
#log4j.appender.RMSUMMARY.MaxBackupIndex=20 |
||||
#log4j.appender.RMSUMMARY.layout=org.apache.log4j.PatternLayout |
||||
#log4j.appender.RMSUMMARY.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n |
@ -0,0 +1,120 @@
<configuration>

<property>
<name>mapred.job.tracker</name>
<value>{{ hadoop['mapred_job_tracker_ha_servicename'] }}</value>
</property>

<property>
<name>mapred.jobtrackers.{{ hadoop['mapred_job_tracker_ha_servicename'] }}</name>
<value>{{ groups['hadoop_masters'] | join(',') }}</value>
<description>Comma-separated list of JobTracker IDs.</description>
</property>

<property>
<name>mapred.ha.automatic-failover.enabled</name>
<value>true</value>
</property>

<property>
<name>mapred.ha.zkfc.port</name>
<value>{{ hadoop['mapred_ha_zkfc_port'] }}</value>
</property>

<property>
<name>mapred.ha.fencing.methods</name>
<value>shell(/bin/true)</value>
</property>

<property>
<name>ha.zookeeper.quorum</name>
<value>{{ groups.zookeeper_servers | join(':' ~ hadoop['zookeeper_clientport'] ~ ',') }}:{{ hadoop['zookeeper_clientport'] }}</value>
</property>

{% for host in groups['hadoop_masters'] %}
<property>
<name>mapred.jobtracker.rpc-address.{{ hadoop['mapred_job_tracker_ha_servicename'] }}.{{ host }}</name>
<value>{{ host }}:{{ hadoop['mapred_job_tracker_port'] }}</value>
</property>
{% endfor %}
{% for host in groups['hadoop_masters'] %}
<property>
<name>mapred.job.tracker.http.address.{{ hadoop['mapred_job_tracker_ha_servicename'] }}.{{ host }}</name>
<value>0.0.0.0:{{ hadoop['mapred_job_tracker_http_address_port'] }}</value>
</property>
{% endfor %}
{% for host in groups['hadoop_masters'] %}
<property>
<name>mapred.ha.jobtracker.rpc-address.{{ hadoop['mapred_job_tracker_ha_servicename'] }}.{{ host }}</name>
<value>{{ host }}:{{ hadoop['mapred_ha_jobtracker_rpc-address_port'] }}</value>
</property>
{% endfor %}
{% for host in groups['hadoop_masters'] %}
<property>
<name>mapred.ha.jobtracker.http-redirect-address.{{ hadoop['mapred_job_tracker_ha_servicename'] }}.{{ host }}</name>
<value>{{ host }}:{{ hadoop['mapred_job_tracker_http_address_port'] }}</value>
</property>
{% endfor %}

<property>
<name>mapred.jobtracker.restart.recover</name>
<value>true</value>
</property>

<property>
<name>mapred.job.tracker.persist.jobstatus.active</name>
<value>true</value>
</property>

<property>
<name>mapred.job.tracker.persist.jobstatus.hours</name>
<value>1</value>
</property>

<property>
<name>mapred.job.tracker.persist.jobstatus.dir</name>
<value>{{ hadoop['mapred_job_tracker_persist_jobstatus_dir'] }}</value>
</property>

<property>
<name>mapred.client.failover.proxy.provider.{{ hadoop['mapred_job_tracker_ha_servicename'] }}</name>
<value>org.apache.hadoop.mapred.ConfiguredFailoverProxyProvider</value>
</property>

<property>
<name>mapred.client.failover.max.attempts</name>
<value>15</value>
</property>

<property>
<name>mapred.client.failover.sleep.base.millis</name>
<value>500</value>
</property>

<property>
<name>mapred.client.failover.sleep.max.millis</name>
<value>1500</value>
</property>

<property>
<name>mapred.client.failover.connection.retries</name>
<value>0</value>
</property>

<property>
<name>mapred.client.failover.connection.retries.on.timeouts</name>
<value>0</value>
</property>

<property>
<name>mapred.local.dir</name>
<value>{{ hadoop["mapred_local_dir"] | join(',') }}</value>
</property>

<property>
<name>mapred.task.tracker.http.address</name>
<value>0.0.0.0:{{ hadoop['mapred_task_tracker_http_address_port'] }}</value>
</property>

</configuration>
@ -0,0 +1,3 @@
{% for host in groups['hadoop_slaves'] %}
{{ host }}
{% endfor %}
@ -0,0 +1,80 @@
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<configuration>

<property>
<name>ssl.client.truststore.location</name>
<value></value>
<description>Truststore to be used by clients like distcp. Must be
specified.
</description>
</property>

<property>
<name>ssl.client.truststore.password</name>
<value></value>
<description>Optional. Default value is "".
</description>
</property>

<property>
<name>ssl.client.truststore.type</name>
<value>jks</value>
<description>Optional. The keystore file format, default value is "jks".
</description>
</property>

<property>
<name>ssl.client.truststore.reload.interval</name>
<value>10000</value>
<description>Truststore reload check interval, in milliseconds.
Default value is 10000 (10 seconds).
</description>
</property>

<property>
<name>ssl.client.keystore.location</name>
<value></value>
<description>Keystore to be used by clients like distcp. Must be
specified.
</description>
</property>

<property>
<name>ssl.client.keystore.password</name>
<value></value>
<description>Optional. Default value is "".
</description>
</property>

<property>
<name>ssl.client.keystore.keypassword</name>
<value></value>
<description>Optional. Default value is "".
</description>
</property>

<property>
<name>ssl.client.keystore.type</name>
<value>jks</value>
<description>Optional. The keystore file format, default value is "jks".
</description>
</property>

</configuration>
@ -0,0 +1,77 @@
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<configuration>

<property>
<name>ssl.server.truststore.location</name>
<value></value>
<description>Truststore to be used by NN and DN. Must be specified.
</description>
</property>

<property>
<name>ssl.server.truststore.password</name>
<value></value>
<description>Optional. Default value is "".
</description>
</property>

<property>
<name>ssl.server.truststore.type</name>
<value>jks</value>
<description>Optional. The keystore file format, default value is "jks".
</description>
</property>

<property>
<name>ssl.server.truststore.reload.interval</name>
<value>10000</value>
<description>Truststore reload check interval, in milliseconds.
Default value is 10000 (10 seconds).
</description>
</property>

<property>
<name>ssl.server.keystore.location</name>
<value></value>
<description>Keystore to be used by NN and DN. Must be specified.
</description>
</property>

<property>
<name>ssl.server.keystore.password</name>
<value></value>
<description>Must be specified.
</description>
</property>

<property>
<name>ssl.server.keystore.keypassword</name>
<value></value>
<description>Must be specified.
</description>
</property>

<property>
<name>ssl.server.keystore.type</name>
<value>jks</value>
<description>Optional. The keystore file format, default value is "jks".
</description>
</property>

</configuration>
@ -0,0 +1,41 @@
# Firewall configuration written by system-config-firewall
# Manual customization of this file is not recommended.
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
{% if 'hadoop_masters' in group_names %}
-A INPUT -p tcp --dport {{ hadoop['fs_default_FS_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['dfs_namenode_http_address_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['mapred_job_tracker_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['mapred_job_tracker_http_address_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['mapred_ha_jobtracker_rpc-address_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['mapred_ha_zkfc_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['dfs_ha_zkfc_port'] }} -j ACCEPT
{% endif %}
{% if 'hadoop_slaves' in group_names %}
-A INPUT -p tcp --dport {{ hadoop['dfs_datanode_address_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['dfs_datanode_http_address_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['dfs_datanode_ipc_address_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['mapred_task_tracker_http_address_port'] }} -j ACCEPT
{% endif %}
{% if 'qjournal_servers' in group_names %}
-A INPUT -p tcp --dport {{ hadoop['qjournal_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['qjournal_http_port'] }} -j ACCEPT
{% endif %}
{% if 'zookeeper_servers' in group_names %}
-A INPUT -p tcp --dport {{ hadoop['zookeeper_clientport'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['zookeeper_leader_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['zookeeper_election_port'] }} -j ACCEPT
{% endif %}
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
COMMIT
@ -0,0 +1,14 @@
---
# Handlers for the hadoop master services

- name: restart hadoop master services
  service: name={{ item }} state=restarted
  with_items:
    - hadoop-0.20-mapreduce-jobtracker
    - hadoop-hdfs-namenode

- name: restart hadoopha master services
  service: name={{ item }} state=restarted
  with_items:
    - hadoop-0.20-mapreduce-jobtrackerha
    - hadoop-hdfs-namenode
@ -0,0 +1,38 @@
---
# Playbook for Hadoop master servers

- name: Install the namenode and jobtracker packages
  yum: name={{ item }} state=installed
  with_items:
    - hadoop-0.20-mapreduce-jobtrackerha
    - hadoop-hdfs-namenode
    - hadoop-hdfs-zkfc
    - hadoop-0.20-mapreduce-zkfc

- name: Copy the hadoop configuration files
  template: src=roles/common/templates/hadoop_ha_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
  with_items:
    - core-site.xml
    - hadoop-metrics.properties
    - hadoop-metrics2.properties
    - hdfs-site.xml
    - log4j.properties
    - mapred-site.xml
    - slaves
    - ssl-client.xml.example
    - ssl-server.xml.example
  notify: restart hadoopha master services

- name: Create the data directory for the namenode metadata
  file: path={{ item }} owner=hdfs group=hdfs state=directory
  with_items: hadoop.dfs_namenode_name_dir

- name: Create the data directory for the jobtracker ha
  file: path={{ item }} owner=mapred group=mapred state=directory
  with_items: hadoop.mapred_job_tracker_persist_jobstatus_dir

- name: Format the namenode
  shell: creates=/usr/lib/hadoop/namenode.formatted su - hdfs -c "hadoop namenode -format"; touch /usr/lib/hadoop/namenode.formatted

- name: start hadoop namenode services
  service: name=hadoop-hdfs-namenode state=started
@ -0,0 +1,38 @@
---
# Playbook for Hadoop master servers

- name: Install the namenode and jobtracker packages
  yum: name={{ item }} state=installed
  with_items:
    - hadoop-0.20-mapreduce-jobtracker
    - hadoop-hdfs-namenode

- name: Copy the hadoop configuration files for no ha
  template: src=roles/common/templates/hadoop_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
  with_items:
    - core-site.xml
    - hadoop-metrics.properties
    - hadoop-metrics2.properties
    - hdfs-site.xml
    - log4j.properties
    - mapred-site.xml
    - slaves
    - ssl-client.xml.example
    - ssl-server.xml.example
  notify: restart hadoop master services

- name: Create the data directory for the namenode metadata
  file: path={{ item }} owner=hdfs group=hdfs state=directory
  with_items: hadoop.dfs_namenode_name_dir

- name: Format the namenode
  shell: creates=/usr/lib/hadoop/namenode.formatted su - hdfs -c "hadoop namenode -format"; touch /usr/lib/hadoop/namenode.formatted

- name: start hadoop namenode services
  service: name=hadoop-hdfs-namenode state=started

- name: Give permissions for mapred users
  shell: creates=/usr/lib/hadoop/fs.initialized su - hdfs -c "hadoop fs -chown hdfs:hadoop /"; su - hdfs -c "hadoop fs -chmod 0774 /"; touch /usr/lib/hadoop/fs.initialized

- name: start hadoop jobtracker services
  service: name=hadoop-0.20-mapreduce-jobtracker state=started
@ -0,0 +1,9 @@
---
# Playbook for Hadoop master primary servers

- include: hadoop_master.yml
  when: ha_enabled

- include: hadoop_master_no_ha.yml
  when: not ha_enabled
@ -0,0 +1,14 @@
---
# Handlers for the hadoop master services

- name: restart hadoop master services
  service: name={{ item }} state=restarted
  with_items:
    - hadoop-0.20-mapreduce-jobtracker
    - hadoop-hdfs-namenode

- name: restart hadoopha master services
  service: name={{ item }} state=restarted
  with_items:
    - hadoop-0.20-mapreduce-jobtrackerha
    - hadoop-hdfs-namenode
@ -0,0 +1,64 @@
---
# Playbook for Hadoop master secondary server

- name: Install the namenode and jobtracker packages
  yum: name={{ item }} state=installed
  with_items:
    - hadoop-0.20-mapreduce-jobtrackerha
    - hadoop-hdfs-namenode
    - hadoop-hdfs-zkfc
    - hadoop-0.20-mapreduce-zkfc

- name: Copy the hadoop configuration files
  template: src=roles/common/templates/hadoop_ha_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
  with_items:
    - core-site.xml
    - hadoop-metrics.properties
    - hadoop-metrics2.properties
    - hdfs-site.xml
    - log4j.properties
    - mapred-site.xml
    - slaves
    - ssl-client.xml.example
    - ssl-server.xml.example
  notify: restart hadoopha master services

- name: Create the data directory for the namenode metadata
  file: path={{ item }} owner=hdfs group=hdfs state=directory
  with_items: hadoop.dfs_namenode_name_dir

- name: Create the data directory for the jobtracker ha
  file: path={{ item }} owner=mapred group=mapred state=directory
  with_items: hadoop.mapred_job_tracker_persist_jobstatus_dir

- name: Initialize the secondary namenode
  shell: creates=/usr/lib/hadoop/namenode.formatted su - hdfs -c "hadoop namenode -bootstrapStandby"; touch /usr/lib/hadoop/namenode.formatted

- name: start hadoop namenode services
  service: name=hadoop-hdfs-namenode state=started

- name: Initialize the zkfc for namenode
  shell: creates=/usr/lib/hadoop/zkfc.formatted su - hdfs -c "hdfs zkfc -formatZK"; touch /usr/lib/hadoop/zkfc.formatted

- name: start zkfc for namenodes
  service: name=hadoop-hdfs-zkfc state=started
  delegate_to: '{{ item }}'
  with_items: groups.hadoop_masters

- name: Give permissions for mapred users
  shell: creates=/usr/lib/hadoop/fs.initialized su - hdfs -c "hadoop fs -chown hdfs:hadoop /"; su - hdfs -c "hadoop fs -chmod 0774 /"; touch /usr/lib/hadoop/fs.initialized

- name: Initialize the zkfc for jobtracker
  shell: creates=/usr/lib/hadoop/zkfcjob.formatted su - mapred -c "hadoop mrzkfc -formatZK"; touch /usr/lib/hadoop/zkfcjob.formatted

- name: start zkfc for jobtracker
  service: name=hadoop-0.20-mapreduce-zkfc state=started
  delegate_to: '{{ item }}'
  with_items: groups.hadoop_masters

- name: start hadoop jobtracker services
  service: name=hadoop-0.20-mapreduce-jobtrackerha state=started
  delegate_to: '{{ item }}'
  with_items: groups.hadoop_masters
@ -0,0 +1,8 @@
---
# Handlers for the hadoop slave services

- name: restart hadoop slave services
  service: name={{ item }} state=restarted
  with_items:
    - hadoop-0.20-mapreduce-tasktracker
    - hadoop-hdfs-datanode
@ -0,0 +1,4 @@
---
# Playbook for Hadoop slave servers

- include: slaves.yml tags=slaves
@ -0,0 +1,53 @@
---
# Playbook for Hadoop slave servers

- name: Install the datanode and tasktracker packages
  yum: name={{ item }} state=installed
  with_items:
    - hadoop-0.20-mapreduce-tasktracker
    - hadoop-hdfs-datanode

- name: Copy the hadoop configuration files
  template: src=roles/common/templates/hadoop_ha_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
  with_items:
    - core-site.xml
    - hadoop-metrics.properties
    - hadoop-metrics2.properties
    - hdfs-site.xml
    - log4j.properties
    - mapred-site.xml
    - slaves
    - ssl-client.xml.example
    - ssl-server.xml.example
  when: ha_enabled
  notify: restart hadoop slave services

- name: Copy the hadoop configuration files for non ha
  template: src=roles/common/templates/hadoop_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
  with_items:
    - core-site.xml
    - hadoop-metrics.properties
    - hadoop-metrics2.properties
    - hdfs-site.xml
    - log4j.properties
    - mapred-site.xml
    - slaves
    - ssl-client.xml.example
    - ssl-server.xml.example
  when: not ha_enabled
  notify: restart hadoop slave services

- name: Create the data directory for the slave nodes to store the data
  file: path={{ item }} owner=hdfs group=hdfs state=directory
  with_items: hadoop.dfs_datanode_data_dir

- name: Create the data directory for the slave nodes for mapreduce
  file: path={{ item }} owner=mapred group=mapred state=directory
  with_items: hadoop.mapred_local_dir

- name: start hadoop slave services
  service: name={{ item }} state=started
  with_items:
    - hadoop-0.20-mapreduce-tasktracker
    - hadoop-hdfs-datanode
@ -0,0 +1,5 @@
---
# The journal node handlers

- name: restart qjournal services
  service: name=hadoop-hdfs-journalnode state=restarted
@ -0,0 +1,22 @@
---
# Playbook for the qjournal nodes

- name: Install the qjournal package
  yum: name=hadoop-hdfs-journalnode state=installed

- name: Create folder for Journaling
  file: path={{ hadoop.dfs_journalnode_edits_dir }} state=directory owner=hdfs group=hdfs

- name: Copy the hadoop configuration files
  template: src=roles/common/templates/hadoop_ha_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
  with_items:
    - core-site.xml
    - hadoop-metrics.properties
    - hadoop-metrics2.properties
    - hdfs-site.xml
    - log4j.properties
    - mapred-site.xml
    - slaves
    - ssl-client.xml.example
    - ssl-server.xml.example
  notify: restart qjournal services
@ -0,0 +1,5 @@
---
# Handler for the zookeeper services

- name: restart zookeeper
  service: name=zookeeper-server state=restarted
@ -0,0 +1,13 @@
---
# The plays for zookeeper daemons

- name: Install the zookeeper files
  yum: name=zookeeper-server state=installed

- name: Copy the configuration file for zookeeper
  template: src=zoo.cfg.j2 dest=/etc/zookeeper/conf/zoo.cfg
  notify: restart zookeeper

- name: initialize the zookeeper
  shell: creates=/var/lib/zookeeper/myid service zookeeper-server init --myid={{ zoo_id }}
@ -0,0 +1,9 @@
tickTime=2000
dataDir=/var/lib/zookeeper/
clientPort={{ hadoop['zookeeper_clientport'] }}
initLimit=5
syncLimit=2
{% for host in groups['zookeeper_servers'] %}
server.{{ hostvars[host].zoo_id }}={{ host }}:{{ hadoop['zookeeper_leader_port'] }}:{{ hadoop['zookeeper_election_port'] }}
{% endfor %}
@ -0,0 +1,28 @@
---
# The main playbook to deploy the site

- hosts: hadoop_all
  roles:
    - common

- hosts: zookeeper_servers
  roles:
    - { role: zookeeper_servers, when: ha_enabled }

- hosts: qjournal_servers
  roles:
    - { role: qjournal_servers, when: ha_enabled }

- hosts: hadoop_master_primary
  roles:
    - { role: hadoop_primary }

- hosts: hadoop_master_secondary
  roles:
    - { role: hadoop_secondary, when: ha_enabled }

- hosts: hadoop_slaves
  roles:
    - { role: hadoop_slaves }