Also trim the hadoop example since we aren't maintaining this live anymore. See galaxy.ansible.com for roles content.

pull/63/head
Michael DeHaan 10 years ago
parent e564ed51b5
commit 4663d2e789
  1. hadoop/LICENSE.md (4)
  2. hadoop/README.md (363)
  3. hadoop/group_vars/all (56)
  4. hadoop/hosts (31)
  5. hadoop/images/hadoop.png (BIN)
  6. hadoop/images/hadoopha.png (BIN)
  7. hadoop/images/hdfs.png (BIN)
  8. hadoop/images/map.png (BIN)
  9. hadoop/images/qjm.png (BIN)
  10. hadoop/images/reduce.png (BIN)
  11. hadoop/images/zookeeper.png (BIN)
  12. hadoop/playbooks/inputfile (19)
  13. hadoop/playbooks/job.yml (21)
  14. hadoop/roles/common/files/etc/cloudera-CDH4.repo (5)
  15. hadoop/roles/common/handlers/main.yml (2)
  16. hadoop/roles/common/tasks/common.yml (28)
  17. hadoop/roles/common/tasks/main.yml (5)
  18. hadoop/roles/common/templates/etc/hosts.j2 (5)
  19. hadoop/roles/common/templates/hadoop_conf/core-site.xml.j2 (25)
  20. hadoop/roles/common/templates/hadoop_conf/hadoop-metrics.properties.j2 (75)
  21. hadoop/roles/common/templates/hadoop_conf/hadoop-metrics2.properties.j2 (44)
  22. hadoop/roles/common/templates/hadoop_conf/hdfs-site.xml.j2 (57)
  23. hadoop/roles/common/templates/hadoop_conf/log4j.properties.j2 (219)
  24. hadoop/roles/common/templates/hadoop_conf/mapred-site.xml.j2 (22)
  25. hadoop/roles/common/templates/hadoop_conf/slaves.j2 (3)
  26. hadoop/roles/common/templates/hadoop_conf/ssl-client.xml.example.j2 (80)
  27. hadoop/roles/common/templates/hadoop_conf/ssl-server.xml.example.j2 (77)
  28. hadoop/roles/common/templates/hadoop_ha_conf/core-site.xml.j2 (25)
  29. hadoop/roles/common/templates/hadoop_ha_conf/hadoop-metrics.properties.j2 (75)
  30. hadoop/roles/common/templates/hadoop_ha_conf/hadoop-metrics2.properties.j2 (44)
  31. hadoop/roles/common/templates/hadoop_ha_conf/hdfs-site.xml.j2 (103)
  32. hadoop/roles/common/templates/hadoop_ha_conf/log4j.properties.j2 (219)
  33. hadoop/roles/common/templates/hadoop_ha_conf/mapred-site.xml.j2 (120)
  34. hadoop/roles/common/templates/hadoop_ha_conf/slaves.j2 (3)
  35. hadoop/roles/common/templates/hadoop_ha_conf/ssl-client.xml.example.j2 (80)
  36. hadoop/roles/common/templates/hadoop_ha_conf/ssl-server.xml.example.j2 (77)
  37. hadoop/roles/common/templates/iptables.j2 (40)
  38. hadoop/roles/hadoop_primary/handlers/main.yml (14)
  39. hadoop/roles/hadoop_primary/tasks/hadoop_master.yml (38)
  40. hadoop/roles/hadoop_primary/tasks/hadoop_master_no_ha.yml (38)
  41. hadoop/roles/hadoop_primary/tasks/main.yml (9)
  42. hadoop/roles/hadoop_secondary/handlers/main.yml (14)
  43. hadoop/roles/hadoop_secondary/tasks/main.yml (64)
  44. hadoop/roles/hadoop_slaves/handlers/main.yml (8)
  45. hadoop/roles/hadoop_slaves/tasks/main.yml (4)
  46. hadoop/roles/hadoop_slaves/tasks/slaves.yml (53)
  47. hadoop/roles/qjournal_servers/handlers/main.yml (5)
  48. hadoop/roles/qjournal_servers/tasks/main.yml (22)
  49. hadoop/roles/zookeeper_servers/handlers/main.yml (5)
  50. hadoop/roles/zookeeper_servers/tasks/main.yml (13)
  51. hadoop/roles/zookeeper_servers/templates/zoo.cfg.j2 (9)
  52. hadoop/site.yml (28)

@ -1,4 +0,0 @@
Copyright (C) 2013 AnsibleWorks, Inc.
This work is licensed under the Creative Commons Attribution 3.0 Unported License.
To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/deed.en_US.

@ -1,363 +0,0 @@
# Deploying Hadoop Clusters using Ansible
## Preface
The playbooks in this example are designed to deploy a Hadoop cluster on a
CentOS 6 or RHEL 6 environment using Ansible. The playbooks can:
1) Deploy a fully functional Hadoop cluster with HA and automatic failover.
2) Deploy a fully functional Hadoop cluster with no HA.
3) Deploy additional nodes to scale the cluster.
These playbooks require Ansible 1.2 and CentOS 6 or RHEL 6 target machines, and they
install the open-source Cloudera Hadoop Distribution (CDH) version 4.
## Hadoop Components
Hadoop is a framework that allows processing of large datasets across large
clusters. The two main components that make up a Hadoop cluster are the HDFS
filesystem and the MapReduce framework. Briefly, the HDFS filesystem is responsible
for storing data across the cluster nodes on their local disks. MapReduce
jobs are the tasks that run on these nodes to produce a meaningful result
from the data stored on the HDFS filesystem.
Let's have a closer look at each of these components.
## HDFS
![Alt text](images/hdfs.png "HDFS")
The above diagram illustrates an HDFS filesystem. The cluster consists of three
DataNodes which are responsible for storing/replicating data, while the NameNode
is a process which is responsible for storing the metadata for the entire
filesystem. As the example above illustrates, when a client wants to write a
file to the HDFS cluster, it first contacts the NameNode and lets it know that
it wants to write a file. The NameNode then decides where and how the file
should be saved and notifies the client about its decision.
In the given example "File1" has a size of 128MB and the block size of the HDFS
filesystem is 64 MB. Hence, the NameNode instructs the client to break down the
file into two different blocks and write the first block to DataNode 1 and the
second block to DataNode 2. Upon receiving the notification from the NameNode,
the client contacts DataNode 1 and DataNode 2 directly to write the data.
Once the data is received by the DataNodes, they replicate the block across the
other nodes. The number of nodes across which the data is replicated is
based on the HDFS configuration, the default value being 3. Meanwhile the
NameNode updates its metadata with the entry of the new file "File1" and the
locations where the parts are stored.
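A quick way to inspect this block layout on a running cluster (a sketch; the file
path /File1 is just the name used in the example above) is to run an HDFS file
check as the 'hdfs' user:

    # Show how the file was split into blocks and where each replica lives
    hadoop fsck /File1 -files -blocks -locations
    # Show the configured block size (64 MB, i.e. 67108864 bytes, in these playbooks)
    hdfs getconf -confKey dfs.blocksize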
## MapReduce
MapReduce is a Java framework that processes the data stored in the
HDFS filesystem to produce a useful and meaningful result. The whole job is
split into two parts: the "Map" job and the "Reduce" job.
Let's consider an example. In the previous step we uploaded "File1"
into the HDFS filesystem and the file was broken down into two different
blocks. Let's assume that the first block had the data "black sheep" in it and
the second block had the data "white sheep" in it. Now let's assume a client
wants to get a count of all the words occurring in "File1". In order to get the
count, the first thing the client would have to do is write a "Map" program and
then a "Reduce" program.
Here's pseudo code of how the Map and Reduce jobs might look:

    mapper (File1, file-contents):
        for each word in file-contents:
            emit (word, 1)

    reducer (word, values):
        sum = 0
        for each value in values:
            sum = sum + value
        emit (word, sum)
The work of the Map job is to go through all the words in the file and emit
a key/value pair. In this case the key is the word itself and the value is always
1.
The Reduce job is quite simple: for each key, it adds up all the values it
receives (each of which is 1 here) and emits the total.
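To make the data flow concrete, the same map/shuffle/reduce sequence can be
simulated locally with ordinary shell tools (this is only an illustration of the
idea, not how Hadoop actually executes the job):

    # map: one word per line; shuffle: sort groups identical keys; reduce: count per word
    printf 'black sheep\nwhite sheep\n' | tr ' ' '\n' | sort | uniq -c
    #   1 black
    #   2 sheep
    #   1 white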
Once the Map and Reduce jobs are ready, the client would instruct the
"JobTracker" (the process responsible for scheduling the jobs on the cluster)
to run the MapReduce job on "File1".
Let's have a closer look at the anatomy of a Map job.
![Alt text](images/map.png "Map job")
As the figure above shows, when the client instructs the JobTracker to run a
job on File1, the JobTracker first contacts the NameNode to determine where the
blocks of File1 are. Then the JobTracker sends the Map job's JAR file down
to the nodes having the blocks, and the TaskTracker processes on those nodes run
the application.
In the above example, DataNode 1 and DataNode 2 have the blocks, so the
TaskTrackers on those nodes run the Map jobs. Once the jobs are completed, the
two nodes have key/value results as below:

    MapJob Results:
        TaskTracker1:
            "Black: 1"
            "Sheep: 1"
        TaskTracker2:
            "White: 1"
            "Sheep: 1"
Once the Map phase is completed, the JobTracker process initiates the Shuffle
and Reduce process.
Let's have a closer look at the Shuffle-Reduce job.
![Alt text](images/reduce.png "Reduce job")
As the figure above demonstrates, the first thing that the JobTracker does is
spawn a Reducer job on the DataNode/TaskTracker nodes for each "key" in the job
result. In this case we have three keys, "black, white, sheep", in our result,
so three Reducers are spawned: one for each key. The Map jobs shuffle the keys
out to the respective Reduce jobs. Then the Reduce job code runs and the sum is
calculated, and the result is written into the HDFS filesystem in a common
directory. In the above example the output directory is specified as
"/home/ben/output", so all the Reducers will write their results into this
directory under different filenames; the file names being "part-00xx", where xx
is the Reducer/partition number.
## Hadoop Deployment
![Alt text](images/hadoop.png "Reduce job")
The above diagram depicts a typical Hadoop deployment. The NameNode and
JobTracker usually reside on the same machine, though they can run on separate
machines. The DataNodes and TaskTrackers run on the same nodes. The size of the
cluster can be scaled to thousands of nodes with petabytes of storage.
The above deployment model provides redundancy for data, as the HDFS filesystem
takes care of the data replication. The only single points of failure are the
NameNode and the JobTracker; if either of these components fails, the cluster
will not be usable.
## Making Hadoop HA
To make the Hadoop cluster highly available we would have to add another set of
JobTracker/NameNode processes, and make sure that any data updated on the primary
is also somehow propagated to the standby. In case of failure of the primary node,
the secondary node takes over that role.
The first thing that has to be dealt with is the data held by the NameNode. As
we recall, the NameNode holds all of the metadata about the filesystem, so any
update to the metadata should also be reflected on the secondary NameNode's
metadata copy. The synchronization of the primary and secondary NameNode
metadata is handled by the Quorum Journal Manager.
### Quorum Journal Manager
![Alt text](images/qjm.png "QJM")
As the figure above shows, the Quorum Journal Manager consists of the journal
manager clients and journal manager nodes. The journal manager clients reside
on the same nodes as the NameNodes; the client on the primary node collects all
the edit logs happening on the NameNode and sends them out to the journal nodes.
The journal manager client residing on the secondary NameNode regularly contacts
the journal nodes and updates its local metadata to be consistent with the
master node. In case of primary node failure the secondary NameNode updates
itself to the latest edit logs and takes over as the primary NameNode.
### Zookeeper
Apart from data consistency, a distributed cluster system also needs a
mechanism for centralized coordination. For example, there should be a way for
the secondary node to tell if the primary node is running properly, and if not
it has to take up the role of the primary. Zookeeper provides Hadoop with a
mechanism to coordinate in this way.
![Alt text](images/zookeeper.png "Zookeeper")
As the figure above shows, Zookeeper is a client/server based
service. The server component itself is replicated over a set of machines that
comprise the service. In short, high availability is built into the Zookeeper
servers.
For Hadoop, two Zookeeper clients have been built, the ZKFCs (Zookeeper Failover
Controllers): one for the NameNode and one for the JobTracker. These clients run
on the same machines as the NameNodes/JobTrackers themselves.
When a ZKFC client is started, it establishes a connection with one of the
Zookeeper nodes and obtains a session ID. The client then performs health checks
on the NameNode/JobTracker and sends heartbeats to ZooKeeper.
If the ZKFC client detects a failure of the NameNode/JobTracker, it removes
itself from the ZooKeeper active/standby election, and the other ZKFC client
fences the node/service and takes over the primary role.
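As a side note, a running Zookeeper ensemble can be sanity-checked with
Zookeeper's built-in four-letter commands (a sketch; the hostname comes from the
sample inventory below and 2181 is the client port set in group_vars/all):

    # "imok" means the Zookeeper server is alive
    echo ruok | nc zhadoop1 2181
    # "stat" reports whether this server is the ensemble leader or a follower
    echo stat | nc zhadoop1 2181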
## Hadoop HA Deployment
![Alt text](images/hadoopha.png "Hadoop_HA")
The above diagram depicts a fully HA Hadoop Cluster with no single point of
failure and automated failover.
## Deploying Hadoop Clusters with Ansible
Setting up a Hadoop cluster without HA can itself be a challenging and
time-consuming task, and with HA, things become even more difficult.
Ansible can automate the whole process of deploying a Hadoop cluster, with or
without HA, with the same playbook, in a matter of minutes. This can be used for
quick environment rebuilds, and in case of disaster or node failures, recovery
time can be greatly reduced with Ansible automation.
Let's have a look to see how this is done.
## Deploying a Hadoop cluster with HA
### Prerequisites
These playbooks have been tested using Ansible v1.2 and CentOS 6.x (64 bit).
Modify group_vars/all to choose the network interface for Hadoop communication.
Optionally, you can change Hadoop-specific parameters like ports or directories
by editing the group_vars/all file, as shown below.
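For example, a minimal set of overrides in group_vars/all might look like this
(the values are illustrative; the full variable list is in the group_vars/all
file shown later in this commit):

    iface: 'eth1'       # network interface Hadoop traffic should use
    ha_enabled: True    # deploy the HA variant with QJM and Zookeeper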
Before launching the deployment playbook, make sure the inventory file (hosts)
is set up properly. Here's a sample:

    [hadoop_master_primary]
    zhadoop1

    [hadoop_master_secondary]
    zhadoop2

    [hadoop_masters:children]
    hadoop_master_primary
    hadoop_master_secondary

    [hadoop_slaves]
    hadoop1
    hadoop2
    hadoop3

    [qjournal_servers]
    zhadoop1
    zhadoop2
    zhadoop3

    [zookeeper_servers]
    zhadoop1 zoo_id=1
    zhadoop2 zoo_id=2
    zhadoop3 zoo_id=3
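Before running the full playbook, a quick connectivity check can catch
unreachable or misconfigured hosts (a sketch; it assumes the same SSH access the
playbooks themselves use):

    ansible -i hosts all -m ping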
Once the inventory is set up, the Hadoop cluster can be deployed using the
following command:

    ansible-playbook -i hosts site.yml
Once deployed, we can check the cluster sanity in different ways. To check the
status of the HDFS filesystem and get a report on all the DataNodes, log in as
the 'hdfs' user on any Hadoop master server, and issue the following command to
get the report:

    hadoop dfsadmin -report
To check the sanity of HA, first log in as the 'hdfs' user on any Hadoop master
server and get the current active/standby NameNode servers this way:

    -bash-4.1$ hdfs haadmin -getServiceState zhadoop1
    active
    -bash-4.1$ hdfs haadmin -getServiceState zhadoop2
    standby
To get the state of the JobTracker process, log in as the 'mapred' user on any
Hadoop master server and issue the following command:

    -bash-4.1$ hadoop mrhaadmin -getServiceState hadoop1
    standby
    -bash-4.1$ hadoop mrhaadmin -getServiceState hadoop2
    active
Once you have determined which server is active and which is standby, you can
kill the NameNode/JobTracker process on the server listed as active and issue
the same commands again, and you should see that the standby has been promoted
to the active state. Later, you can restart the killed process and see that node
listed as standby.
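For example, the failover can be exercised on the node currently reported as
active like this (a sketch; the service name below is an assumption based on a
CDH4 package install and may differ on your systems):

    # On the active NameNode host, as root: stop the NameNode service
    service hadoop-hdfs-namenode stop
    # From the other master, the standby should now report itself as active
    hdfs haadmin -getServiceState zhadoop2
    # Bring the stopped NameNode back; it rejoins the cluster as standby
    service hadoop-hdfs-namenode start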
### Running a MapReduce Job
To deploy the MapReduce job, save the following script as /tmp/job.sh on any of
the Hadoop master nodes and run it as the user 'hdfs' (e.g. su - hdfs -c
"/tmp/job.sh"). The job counts the number of occurrences of the word 'hello' in
the given inputfile.

    #!/bin/bash
    cat > /tmp/inputfile << EOF
    hello
    sf
    sdf
    hello
    sdf
    sdf
    EOF
    hadoop fs -put /tmp/inputfile /inputfile
    hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar grep /inputfile /outputfile 'hello'
    hadoop fs -get /outputfile /tmp/outputfile/
To verify the result, read the file on the server located at
/tmp/outputfile/part-00000, which should give you the count.
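For the sample input above, the fetched result should look something like this
(the Hadoop 'grep' example writes the match count followed by the matched term):

    $ cat /tmp/outputfile/part-00000
    2       hello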
## Scale the Cluster
When the Hadoop cluster reaches its maximum capacity, it can be scaled by
adding nodes. This can be easily accomplished by adding the node hostname to
the Ansible inventory under the hadoop_slaves group, and running the following
command:

    ansible-playbook -i hosts site.yml --tags=slaves
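For example, to add a hypothetical new slave node hadoop4, the only inventory
change needed is one extra line under the hadoop_slaves group:

    [hadoop_slaves]
    hadoop1
    hadoop2
    hadoop3
    hadoop4

The --tags=slaves option restricts the run to tasks tagged 'slaves' (such as the
common tasks included with tags=slaves in roles/common/tasks/main.yml), so a
scale-out run does not redo the full site deployment.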
## Deploy a non-HA Hadoop Cluster
The earlier Hadoop Deployment diagram (images/hadoop.png) illustrates a
standalone (non-HA) Hadoop cluster.
To deploy this cluster, fill in the inventory file as follows:

    [hadoop_all:children]
    hadoop_masters
    hadoop_slaves

    [hadoop_master_primary]
    zhadoop1

    [hadoop_master_secondary]

    [hadoop_masters:children]
    hadoop_master_primary
    hadoop_master_secondary

    [hadoop_slaves]
    hadoop1
    hadoop2
    hadoop3
Edit the group_vars/all file to disable HA:

    ha_enabled: False

And run the following command:

    ansible-playbook -i hosts site.yml
The validity of the cluster can be checked by running the same MapReduce job
that was documented above for the HA Hadoop cluster.

@ -1,56 +0,0 @@
# Defaults to the first ethernet interface. Change this to:
#
# iface: eth1
#
# ...to override.
#
iface: '{{ ansible_default_ipv4.interface }}'
ha_enabled: False
hadoop:

  #Variables for <core-site_xml> - common
  fs_default_FS_port: 8020
  nameservice_id: mycluster4

  #Variables for <hdfs-site_xml>
  dfs_permissions_superusergroup: hdfs
  dfs_namenode_name_dir:
    - /namedir1/
    - /namedir2/
  dfs_replication: 3
  dfs_namenode_handler_count: 50
  dfs_blocksize: 67108864
  dfs_datanode_data_dir:
    - /datadir1/
    - /datadir2/
  dfs_datanode_address_port: 50010
  dfs_datanode_http_address_port: 50075
  dfs_datanode_ipc_address_port: 50020
  dfs_namenode_http_address_port: 50070
  dfs_ha_zkfc_port: 8019
  qjournal_port: 8485
  qjournal_http_port: 8480
  dfs_journalnode_edits_dir: /journaldir/
  zookeeper_clientport: 2181
  zookeeper_leader_port: 2888
  zookeeper_election_port: 3888

  #Variables for <mapred-site_xml> - common
  mapred_job_tracker_ha_servicename: myjt4
  mapred_job_tracker_http_address_port: 50030
  mapred_task_tracker_http_address_port: 50060
  mapred_job_tracker_port: 8021
  mapred_ha_jobtracker_rpc-address_port: 8023
  mapred_ha_zkfc_port: 8018
  mapred_job_tracker_persist_jobstatus_dir: /jobdir/
  mapred_local_dir:
    - /mapred1/
    - /mapred2/

@ -1,31 +0,0 @@
[hadoop_all:children]
hadoop_masters
hadoop_slaves
qjournal_servers
zookeeper_servers
[hadoop_master_primary]
hadoop1
[hadoop_master_secondary]
hadoop2
[hadoop_masters:children]
hadoop_master_primary
hadoop_master_secondary
[hadoop_slaves]
hadoop1
hadoop2
hadoop3
[qjournal_servers]
hadoop1
hadoop2
hadoop3
[zookeeper_servers]
hadoop1 zoo_id=1
hadoop2 zoo_id=2
hadoop3 zoo_id=3

Binary files not shown (7 PNG images under hadoop/images/, 64 to 148 KiB each).
@ -1,19 +0,0 @@
asdf
sdf
sdf
sd
f
sf
sdf
sd
fsd
hello
asf
sf
sd
fsd
f
sdf
sd
hello

@ -1,21 +0,0 @@
---
# Launch a job to count occurrences of a word.
- hosts: $server
  user: root
  tasks:
    - name: copy the file
      copy: src=inputfile dest=/tmp/inputfile

    - name: upload the file
      shell: su - hdfs -c "hadoop fs -put /tmp/inputfile /inputfile"

    - name: Run the MapReduce job to count the occurrences of the word hello
      shell: su - hdfs -c "hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar grep /inputfile /outputfile 'hello'"

    - name: Fetch the outputfile to the local tmp dir
      shell: su - hdfs -c "hadoop fs -get /outputfile /tmp/outputfile"

    - name: Get the outputfile to the ansible server
      fetch: dest=/tmp src=/tmp/outputfile/part-00000

@ -1,5 +0,0 @@
[cloudera-cdh4]
name=Cloudera's Distribution for Hadoop, Version 4
baseurl=http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/4/
gpgkey = http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
gpgcheck = 1

@ -1,2 +0,0 @@
- name: restart iptables
  service: name=iptables state=restarted

@ -1,28 +0,0 @@
---
# The playbook for common tasks

- name: Deploy the Cloudera Repository
  copy: src=etc/cloudera-CDH4.repo dest=/etc/yum.repos.d/cloudera-CDH4.repo

- name: Install the libselinux-python package
  yum: name=libselinux-python state=installed

- name: Install the openjdk package
  yum: name=java-1.6.0-openjdk state=installed

- name: Create a directory for java
  file: state=directory path=/usr/java/

- name: Create a link for java
  file: src=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre state=link path=/usr/java/default

- name: Create the hosts file for all machines
  template: src=etc/hosts.j2 dest=/etc/hosts

- name: Disable SELinux in conf file
  selinux: state=disabled

- name: Create the iptables file for all machines
  template: src=iptables.j2 dest=/etc/sysconfig/iptables
  notify: restart iptables

@ -1,5 +0,0 @@
---
# The playbook for common tasks
- include: common.yml tags=slaves

@ -1,5 +0,0 @@
127.0.0.1 localhost
{% for host in groups.all %}
{{ hostvars[host]['ansible_' + iface].ipv4.address }} {{ host }}
{% endfor %}

@ -1,25 +0,0 @@
<?xml version="1.0"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://{{ hostvars[groups['hadoop_masters'][0]]['ansible_hostname'] + ':' ~ hadoop['fs_default_FS_port'] }}/</value>
</property>
</configuration>

@ -1,75 +0,0 @@
# Configuration of the "dfs" context for null
dfs.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "dfs" context for file
#dfs.class=org.apache.hadoop.metrics.file.FileContext
#dfs.period=10
#dfs.fileName=/tmp/dfsmetrics.log
# Configuration of the "dfs" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# dfs.period=10
# dfs.servers=localhost:8649
# Configuration of the "mapred" context for null
mapred.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "mapred" context for file
#mapred.class=org.apache.hadoop.metrics.file.FileContext
#mapred.period=10
#mapred.fileName=/tmp/mrmetrics.log
# Configuration of the "mapred" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# mapred.period=10
# mapred.servers=localhost:8649
# Configuration of the "jvm" context for null
#jvm.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "jvm" context for file
#jvm.class=org.apache.hadoop.metrics.file.FileContext
#jvm.period=10
#jvm.fileName=/tmp/jvmmetrics.log
# Configuration of the "jvm" context for ganglia
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# jvm.period=10
# jvm.servers=localhost:8649
# Configuration of the "rpc" context for null
rpc.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "rpc" context for file
#rpc.class=org.apache.hadoop.metrics.file.FileContext
#rpc.period=10
#rpc.fileName=/tmp/rpcmetrics.log
# Configuration of the "rpc" context for ganglia
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# rpc.period=10
# rpc.servers=localhost:8649
# Configuration of the "ugi" context for null
ugi.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "ugi" context for file
#ugi.class=org.apache.hadoop.metrics.file.FileContext
#ugi.period=10
#ugi.fileName=/tmp/ugimetrics.log
# Configuration of the "ugi" context for ganglia
# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# ugi.period=10
# ugi.servers=localhost:8649

@ -1,44 +0,0 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# syntax: [prefix].[source|sink].[instance].[options]
# See javadoc of package-info.java for org.apache.hadoop.metrics2 for details
*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
# default sampling period, in seconds
*.period=10
# The namenode-metrics.out will contain metrics from all context
#namenode.sink.file.filename=namenode-metrics.out
# Specifying a special sampling period for namenode:
#namenode.sink.*.period=8
#datanode.sink.file.filename=datanode-metrics.out
# the following example split metrics of different
# context to different sinks (in this case files)
#jobtracker.sink.file_jvm.context=jvm
#jobtracker.sink.file_jvm.filename=jobtracker-jvm-metrics.out
#jobtracker.sink.file_mapred.context=mapred
#jobtracker.sink.file_mapred.filename=jobtracker-mapred-metrics.out
#tasktracker.sink.file.filename=tasktracker-metrics.out
#maptask.sink.file.filename=maptask-metrics.out
#reducetask.sink.file.filename=reducetask-metrics.out

@ -1,57 +0,0 @@
<?xml version="1.0"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.blocksize</name>
<value>{{ hadoop['dfs_blocksize'] }}</value>
</property>
<property>
<name>dfs.permissions.superusergroup</name>
<value>{{ hadoop['dfs_permissions_superusergroup'] }}</value>
</property>
<property>
<name>dfs.namenode.http.address</name>
<value>0.0.0.0:{{ hadoop['dfs_namenode_http_address_port'] }}</value>
</property>
<property>
<name>dfs.datanode.address</name>
<value>0.0.0.0:{{ hadoop['dfs_datanode_address_port'] }}</value>
</property>
<property>
<name>dfs.datanode.http.address</name>
<value>0.0.0.0:{{ hadoop['dfs_datanode_http_address_port'] }}</value>
</property>
<property>
<name>dfs.datanode.ipc.address</name>
<value>0.0.0.0:{{ hadoop['dfs_datanode_ipc_address_port'] }}</value>
</property>
<property>
<name>dfs.replication</name>
<value>{{ hadoop['dfs_replication'] }}</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>{{ hadoop['dfs_namenode_name_dir'] | join(',') }}</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>{{ hadoop['dfs_datanode_data_dir'] | join(',') }}</value>
</property>
</configuration>

@ -1,219 +0,0 @@
# Copyright 2011 The Apache Software Foundation
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Define some default values that can be overridden by system properties
hadoop.root.logger=INFO,console
hadoop.log.dir=.
hadoop.log.file=hadoop.log
# Define the root logger to the system property "hadoop.root.logger".
log4j.rootLogger=${hadoop.root.logger}, EventCounter
# Logging Threshold
log4j.threshold=ALL
# Null Appender
log4j.appender.NullAppender=org.apache.log4j.varia.NullAppender
#
# Rolling File Appender - cap space usage at 5gb.
#
hadoop.log.maxfilesize=256MB
hadoop.log.maxbackupindex=20
log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
log4j.appender.RFA.MaxFileSize=${hadoop.log.maxfilesize}
log4j.appender.RFA.MaxBackupIndex=${hadoop.log.maxbackupindex}
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
# Pattern format: Date LogLevel LoggerName LogMessage
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
# Debugging Pattern format
#log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n
#
# Daily Rolling File Appender
#
log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
# Rollver at midnight
log4j.appender.DRFA.DatePattern=.yyyy-MM-dd
# 30-day backup
#log4j.appender.DRFA.MaxBackupIndex=30
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
# Pattern format: Date LogLevel LoggerName LogMessage
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
# Debugging Pattern format
#log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n
#
# console
# Add "console" to rootlogger above if you want to use this
#
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n
#
# TaskLog Appender
#
#Default values
hadoop.tasklog.taskid=null
hadoop.tasklog.iscleanup=false
hadoop.tasklog.noKeepSplits=4
hadoop.tasklog.totalLogFileSize=100
hadoop.tasklog.purgeLogSplits=true
hadoop.tasklog.logsRetainHours=12
log4j.appender.TLA=org.apache.hadoop.mapred.TaskLogAppender
log4j.appender.TLA.taskId=${hadoop.tasklog.taskid}
log4j.appender.TLA.isCleanup=${hadoop.tasklog.iscleanup}
log4j.appender.TLA.totalLogFileSize=${hadoop.tasklog.totalLogFileSize}
log4j.appender.TLA.layout=org.apache.log4j.PatternLayout
log4j.appender.TLA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
#
# HDFS block state change log from block manager
#
# Uncomment the following to suppress normal block state change
# messages from BlockManager in NameNode.
#log4j.logger.BlockStateChange=WARN
#
#Security appender
#
hadoop.security.logger=INFO,NullAppender
hadoop.security.log.maxfilesize=256MB
hadoop.security.log.maxbackupindex=20
log4j.category.SecurityLogger=${hadoop.security.logger}
hadoop.security.log.file=SecurityAuth-${user.name}.audit
log4j.appender.RFAS=org.apache.log4j.RollingFileAppender
log4j.appender.RFAS.File=${hadoop.log.dir}/${hadoop.security.log.file}
log4j.appender.RFAS.layout=org.apache.log4j.PatternLayout
log4j.appender.RFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
log4j.appender.RFAS.MaxFileSize=${hadoop.security.log.maxfilesize}
log4j.appender.RFAS.MaxBackupIndex=${hadoop.security.log.maxbackupindex}
#
# Daily Rolling Security appender
#
log4j.appender.DRFAS=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFAS.File=${hadoop.log.dir}/${hadoop.security.log.file}
log4j.appender.DRFAS.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
log4j.appender.DRFAS.DatePattern=.yyyy-MM-dd
#
# hdfs audit logging
#
hdfs.audit.logger=INFO,NullAppender
hdfs.audit.log.maxfilesize=256MB
hdfs.audit.log.maxbackupindex=20
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger}
log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false
log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.RFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log
log4j.appender.RFAAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.RFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.RFAAUDIT.MaxFileSize=${hdfs.audit.log.maxfilesize}
log4j.appender.RFAAUDIT.MaxBackupIndex=${hdfs.audit.log.maxbackupindex}
#
# mapred audit logging
#
mapred.audit.logger=INFO,NullAppender
mapred.audit.log.maxfilesize=256MB
mapred.audit.log.maxbackupindex=20
log4j.logger.org.apache.hadoop.mapred.AuditLogger=${mapred.audit.logger}
log4j.additivity.org.apache.hadoop.mapred.AuditLogger=false
log4j.appender.MRAUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.MRAUDIT.File=${hadoop.log.dir}/mapred-audit.log
log4j.appender.MRAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.MRAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.MRAUDIT.MaxFileSize=${mapred.audit.log.maxfilesize}
log4j.appender.MRAUDIT.MaxBackupIndex=${mapred.audit.log.maxbackupindex}
# Custom Logging levels
#log4j.logger.org.apache.hadoop.mapred.JobTracker=DEBUG
#log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG
#log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=DEBUG
# Jets3t library
log4j.logger.org.jets3t.service.impl.rest.httpclient.RestS3Service=ERROR
#
# Event Counter Appender
# Sends counts of logging messages at different severity levels to Hadoop Metrics.
#
log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter
#
# Job Summary Appender
#
# Use following logger to send summary to separate file defined by
# hadoop.mapreduce.jobsummary.log.file :
# hadoop.mapreduce.jobsummary.logger=INFO,JSA
#
hadoop.mapreduce.jobsummary.logger=${hadoop.root.logger}
hadoop.mapreduce.jobsummary.log.file=hadoop-mapreduce.jobsummary.log
hadoop.mapreduce.jobsummary.log.maxfilesize=256MB
hadoop.mapreduce.jobsummary.log.maxbackupindex=20
log4j.appender.JSA=org.apache.log4j.RollingFileAppender
log4j.appender.JSA.File=${hadoop.log.dir}/${hadoop.mapreduce.jobsummary.log.file}
log4j.appender.JSA.MaxFileSize=${hadoop.mapreduce.jobsummary.log.maxfilesize}
log4j.appender.JSA.MaxBackupIndex=${hadoop.mapreduce.jobsummary.log.maxbackupindex}
log4j.appender.JSA.layout=org.apache.log4j.PatternLayout
log4j.appender.JSA.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n
log4j.logger.org.apache.hadoop.mapred.JobInProgress$JobSummary=${hadoop.mapreduce.jobsummary.logger}
log4j.additivity.org.apache.hadoop.mapred.JobInProgress$JobSummary=false
#
# Yarn ResourceManager Application Summary Log
#
# Set the ResourceManager summary log filename
#yarn.server.resourcemanager.appsummary.log.file=rm-appsummary.log
# Set the ResourceManager summary log level and appender
#yarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY
# Appender for ResourceManager Application Summary Log
# Requires the following properties to be set
# - hadoop.log.dir (Hadoop Log directory)
# - yarn.server.resourcemanager.appsummary.log.file (resource manager app summary log filename)
# - yarn.server.resourcemanager.appsummary.logger (resource manager app summary log level and appender)
#log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger}
#log4j.additivity.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=false
#log4j.appender.RMSUMMARY=org.apache.log4j.RollingFileAppender
#log4j.appender.RMSUMMARY.File=${hadoop.log.dir}/${yarn.server.resourcemanager.appsummary.log.file}
#log4j.appender.RMSUMMARY.MaxFileSize=256MB
#log4j.appender.RMSUMMARY.MaxBackupIndex=20
#log4j.appender.RMSUMMARY.layout=org.apache.log4j.PatternLayout
#log4j.appender.RMSUMMARY.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n

@ -1,22 +0,0 @@
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>{{ hostvars[groups['hadoop_masters'][0]]['ansible_hostname'] }}:{{ hadoop['mapred_job_tracker_port'] }}</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>{{ hadoop["mapred_local_dir"] | join(',') }}</value>
</property>
<property>
<name>mapred.task.tracker.http.address</name>
<value>0.0.0.0:{{ hadoop['mapred_task_tracker_http_address_port'] }}</value>
</property>
<property>
<name>mapred.job.tracker.http.address</name>
<value>0.0.0.0:{{ hadoop['mapred_job_tracker_http_address_port'] }}</value>
</property>
</configuration>

@ -1,3 +0,0 @@
{% for host in groups['hadoop_slaves'] %}
{{ host }}
{% endfor %}

@ -1,80 +0,0 @@
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<configuration>
<property>
<name>ssl.client.truststore.location</name>
<value></value>
<description>Truststore to be used by clients like distcp. Must be
specified.
</description>
</property>
<property>
<name>ssl.client.truststore.password</name>
<value></value>
<description>Optional. Default value is "".
</description>
</property>
<property>
<name>ssl.client.truststore.type</name>
<value>jks</value>
<description>Optional. The keystore file format, default value is "jks".
</description>
</property>
<property>
<name>ssl.client.truststore.reload.interval</name>
<value>10000</value>
<description>Truststore reload check interval, in milliseconds.
Default value is 10000 (10 seconds).
</description>
</property>
<property>
<name>ssl.client.keystore.location</name>
<value></value>
<description>Keystore to be used by clients like distcp. Must be
specified.
</description>
</property>
<property>
<name>ssl.client.keystore.password</name>
<value></value>
<description>Optional. Default value is "".
</description>
</property>
<property>
<name>ssl.client.keystore.keypassword</name>
<value></value>
<description>Optional. Default value is "".
</description>
</property>
<property>
<name>ssl.client.keystore.type</name>
<value>jks</value>
<description>Optional. The keystore file format, default value is "jks".
</description>
</property>
</configuration>

@ -1,77 +0,0 @@
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<configuration>
<property>
<name>ssl.server.truststore.location</name>
<value></value>
<description>Truststore to be used by NN and DN. Must be specified.
</description>
</property>
<property>
<name>ssl.server.truststore.password</name>
<value></value>
<description>Optional. Default value is "".
</description>
</property>
<property>
<name>ssl.server.truststore.type</name>
<value>jks</value>
<description>Optional. The keystore file format, default value is "jks".
</description>
</property>
<property>
<name>ssl.server.truststore.reload.interval</name>
<value>10000</value>
<description>Truststore reload check interval, in milliseconds.
Default value is 10000 (10 seconds).
</description>
</property>
<property>
<name>ssl.server.keystore.location</name>
<value></value>
<description>Keystore to be used by NN and DN. Must be specified.
</description>
</property>
<property>
<name>ssl.server.keystore.password</name>
<value></value>
<description>Must be specified.
</description>
</property>
<property>
<name>ssl.server.keystore.keypassword</name>
<value></value>
<description>Must be specified.
</description>
</property>
<property>
<name>ssl.server.keystore.type</name>
<value>jks</value>
<description>Optional. The keystore file format, default value is "jks".
</description>
</property>
</configuration>

@ -1,25 +0,0 @@
<?xml version="1.0"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://{{ hadoop['nameservice_id'] }}/</value>
</property>
</configuration>

@ -1,75 +0,0 @@
# Configuration of the "dfs" context for null
dfs.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "dfs" context for file
#dfs.class=org.apache.hadoop.metrics.file.FileContext
#dfs.period=10
#dfs.fileName=/tmp/dfsmetrics.log
# Configuration of the "dfs" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# dfs.period=10
# dfs.servers=localhost:8649
# Configuration of the "mapred" context for null
mapred.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "mapred" context for file
#mapred.class=org.apache.hadoop.metrics.file.FileContext
#mapred.period=10
#mapred.fileName=/tmp/mrmetrics.log
# Configuration of the "mapred" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# mapred.period=10
# mapred.servers=localhost:8649
# Configuration of the "jvm" context for null
#jvm.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "jvm" context for file
#jvm.class=org.apache.hadoop.metrics.file.FileContext
#jvm.period=10
#jvm.fileName=/tmp/jvmmetrics.log
# Configuration of the "jvm" context for ganglia
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# jvm.period=10
# jvm.servers=localhost:8649
# Configuration of the "rpc" context for null
rpc.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "rpc" context for file
#rpc.class=org.apache.hadoop.metrics.file.FileContext
#rpc.period=10
#rpc.fileName=/tmp/rpcmetrics.log
# Configuration of the "rpc" context for ganglia
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# rpc.period=10
# rpc.servers=localhost:8649
# Configuration of the "ugi" context for null
ugi.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "ugi" context for file
#ugi.class=org.apache.hadoop.metrics.file.FileContext
#ugi.period=10
#ugi.fileName=/tmp/ugimetrics.log
# Configuration of the "ugi" context for ganglia
# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# ugi.period=10
# ugi.servers=localhost:8649

@ -1,44 +0,0 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# syntax: [prefix].[source|sink].[instance].[options]
# See javadoc of package-info.java for org.apache.hadoop.metrics2 for details
*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
# default sampling period, in seconds
*.period=10
# The namenode-metrics.out will contain metrics from all context
#namenode.sink.file.filename=namenode-metrics.out
# Specifying a special sampling period for namenode:
#namenode.sink.*.period=8
#datanode.sink.file.filename=datanode-metrics.out
# the following example split metrics of different
# context to different sinks (in this case files)
#jobtracker.sink.file_jvm.context=jvm
#jobtracker.sink.file_jvm.filename=jobtracker-jvm-metrics.out
#jobtracker.sink.file_mapred.context=mapred
#jobtracker.sink.file_mapred.filename=jobtracker-mapred-metrics.out
#tasktracker.sink.file.filename=tasktracker-metrics.out
#maptask.sink.file.filename=maptask-metrics.out
#reducetask.sink.file.filename=reducetask-metrics.out

@ -1,103 +0,0 @@
<?xml version="1.0"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>dfs.nameservices</name>
<value>{{ hadoop['nameservice_id'] }}</value>
</property>
<property>
<name>dfs.ha.namenodes.{{ hadoop['nameservice_id'] }}</name>
<value>{{ groups.hadoop_masters | join(',') }}</value>
</property>
<property>
<name>dfs.blocksize</name>
<value>{{ hadoop['dfs_blocksize'] }}</value>
</property>
<property>
<name>dfs.permissions.superusergroup</name>
<value>{{ hadoop['dfs_permissions_superusergroup'] }}</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>{{ groups.zookeeper_servers | join(':' ~ hadoop['zookeeper_clientport'] + ',') }}:{{ hadoop['zookeeper_clientport'] }}</value>
</property>
{% for host in groups['hadoop_masters'] %}
<property>
<name>dfs.namenode.rpc-address.{{ hadoop['nameservice_id'] }}.{{ host }}</name>
<value>{{ host }}:{{ hadoop['fs_default_FS_port'] }}</value>
</property>
{% endfor %}
{% for host in groups['hadoop_masters'] %}
<property>
<name>dfs.namenode.http-address.{{ hadoop['nameservice_id'] }}.{{ host }}</name>
<value>{{ host }}:{{ hadoop['dfs_namenode_http_address_port'] }}</value>
</property>
{% endfor %}
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://{{ groups.qjournal_servers | join(':' ~ hadoop['qjournal_port'] + ';') }}:{{ hadoop['qjournal_port'] }}/{{ hadoop['nameservice_id'] }}</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>{{ hadoop['dfs_journalnode_edits_dir'] }}</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.{{ hadoop['nameservice_id'] }}</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>shell(/bin/true )</value>
</property>
<property>
<name>dfs.ha.zkfc.port</name>
<value>{{ hadoop['dfs_ha_zkfc_port'] }}</value>
</property>
<property>
<name>dfs.datanode.address</name>
<value>0.0.0.0:{{ hadoop['dfs_datanode_address_port'] }}</value>
</property>
<property>
<name>dfs.datanode.http.address</name>
<value>0.0.0.0:{{ hadoop['dfs_datanode_http_address_port'] }}</value>
</property>
<property>
<name>dfs.datanode.ipc.address</name>
<value>0.0.0.0:{{ hadoop['dfs_datanode_ipc_address_port'] }}</value>
</property>
<property>
<name>dfs.replication</name>
<value>{{ hadoop['dfs_replication'] }}</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>{{ hadoop['dfs_namenode_name_dir'] | join(',') }}</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>{{ hadoop['dfs_datanode_data_dir'] | join(',') }}</value>
</property>
</configuration>

@ -1,219 +0,0 @@
# Copyright 2011 The Apache Software Foundation
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# Define some default values that can be overridden by system properties
hadoop.root.logger=INFO,console
hadoop.log.dir=.
hadoop.log.file=hadoop.log
# Define the root logger to the system property "hadoop.root.logger".
log4j.rootLogger=${hadoop.root.logger}, EventCounter
# Logging Threshold
log4j.threshold=ALL
# Null Appender
log4j.appender.NullAppender=org.apache.log4j.varia.NullAppender
#
# Rolling File Appender - cap space usage at 5gb.
#
hadoop.log.maxfilesize=256MB
hadoop.log.maxbackupindex=20
log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
log4j.appender.RFA.MaxFileSize=${hadoop.log.maxfilesize}
log4j.appender.RFA.MaxBackupIndex=${hadoop.log.maxbackupindex}
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
# Pattern format: Date LogLevel LoggerName LogMessage
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
# Debugging Pattern format
#log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n
#
# Daily Rolling File Appender
#
log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
# Rollver at midnight
log4j.appender.DRFA.DatePattern=.yyyy-MM-dd
# 30-day backup
#log4j.appender.DRFA.MaxBackupIndex=30
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
# Pattern format: Date LogLevel LoggerName LogMessage
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
# Debugging Pattern format
#log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n
#
# console
# Add "console" to rootlogger above if you want to use this
#
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n
#
# TaskLog Appender
#
#Default values
hadoop.tasklog.taskid=null
hadoop.tasklog.iscleanup=false
hadoop.tasklog.noKeepSplits=4
hadoop.tasklog.totalLogFileSize=100
hadoop.tasklog.purgeLogSplits=true
hadoop.tasklog.logsRetainHours=12
log4j.appender.TLA=org.apache.hadoop.mapred.TaskLogAppender
log4j.appender.TLA.taskId=${hadoop.tasklog.taskid}
log4j.appender.TLA.isCleanup=${hadoop.tasklog.iscleanup}
log4j.appender.TLA.totalLogFileSize=${hadoop.tasklog.totalLogFileSize}
log4j.appender.TLA.layout=org.apache.log4j.PatternLayout
log4j.appender.TLA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
#
# HDFS block state change log from block manager
#
# Uncomment the following to suppress normal block state change
# messages from BlockManager in NameNode.
#log4j.logger.BlockStateChange=WARN
#
#Security appender
#
hadoop.security.logger=INFO,NullAppender
hadoop.security.log.maxfilesize=256MB
hadoop.security.log.maxbackupindex=20
log4j.category.SecurityLogger=${hadoop.security.logger}
hadoop.security.log.file=SecurityAuth-${user.name}.audit
log4j.appender.RFAS=org.apache.log4j.RollingFileAppender
log4j.appender.RFAS.File=${hadoop.log.dir}/${hadoop.security.log.file}
log4j.appender.RFAS.layout=org.apache.log4j.PatternLayout
log4j.appender.RFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
log4j.appender.RFAS.MaxFileSize=${hadoop.security.log.maxfilesize}
log4j.appender.RFAS.MaxBackupIndex=${hadoop.security.log.maxbackupindex}
#
# Daily Rolling Security appender
#
log4j.appender.DRFAS=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFAS.File=${hadoop.log.dir}/${hadoop.security.log.file}
log4j.appender.DRFAS.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
log4j.appender.DRFAS.DatePattern=.yyyy-MM-dd
#
# hdfs audit logging
#
hdfs.audit.logger=INFO,NullAppender
hdfs.audit.log.maxfilesize=256MB
hdfs.audit.log.maxbackupindex=20
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger}
log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false
log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.RFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log
log4j.appender.RFAAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.RFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.RFAAUDIT.MaxFileSize=${hdfs.audit.log.maxfilesize}
log4j.appender.RFAAUDIT.MaxBackupIndex=${hdfs.audit.log.maxbackupindex}
#
# mapred audit logging
#
mapred.audit.logger=INFO,NullAppender
mapred.audit.log.maxfilesize=256MB
mapred.audit.log.maxbackupindex=20
log4j.logger.org.apache.hadoop.mapred.AuditLogger=${mapred.audit.logger}
log4j.additivity.org.apache.hadoop.mapred.AuditLogger=false
log4j.appender.MRAUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.MRAUDIT.File=${hadoop.log.dir}/mapred-audit.log
log4j.appender.MRAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.MRAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.MRAUDIT.MaxFileSize=${mapred.audit.log.maxfilesize}
log4j.appender.MRAUDIT.MaxBackupIndex=${mapred.audit.log.maxbackupindex}
# Custom Logging levels
#log4j.logger.org.apache.hadoop.mapred.JobTracker=DEBUG
#log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG
#log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=DEBUG
# Jets3t library
log4j.logger.org.jets3t.service.impl.rest.httpclient.RestS3Service=ERROR
#
# Event Counter Appender
# Sends counts of logging messages at different severity levels to Hadoop Metrics.
#
log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter
#
# Job Summary Appender
#
# Use following logger to send summary to separate file defined by
# hadoop.mapreduce.jobsummary.log.file :
# hadoop.mapreduce.jobsummary.logger=INFO,JSA
#
hadoop.mapreduce.jobsummary.logger=${hadoop.root.logger}
hadoop.mapreduce.jobsummary.log.file=hadoop-mapreduce.jobsummary.log
hadoop.mapreduce.jobsummary.log.maxfilesize=256MB
hadoop.mapreduce.jobsummary.log.maxbackupindex=20
log4j.appender.JSA=org.apache.log4j.RollingFileAppender
log4j.appender.JSA.File=${hadoop.log.dir}/${hadoop.mapreduce.jobsummary.log.file}
log4j.appender.JSA.MaxFileSize=${hadoop.mapreduce.jobsummary.log.maxfilesize}
log4j.appender.JSA.MaxBackupIndex=${hadoop.mapreduce.jobsummary.log.maxbackupindex}
log4j.appender.JSA.layout=org.apache.log4j.PatternLayout
log4j.appender.JSA.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n
log4j.logger.org.apache.hadoop.mapred.JobInProgress$JobSummary=${hadoop.mapreduce.jobsummary.logger}
log4j.additivity.org.apache.hadoop.mapred.JobInProgress$JobSummary=false
#
# Yarn ResourceManager Application Summary Log
#
# Set the ResourceManager summary log filename
#yarn.server.resourcemanager.appsummary.log.file=rm-appsummary.log
# Set the ResourceManager summary log level and appender
#yarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY
# Appender for ResourceManager Application Summary Log
# Requires the following properties to be set
# - hadoop.log.dir (Hadoop Log directory)
# - yarn.server.resourcemanager.appsummary.log.file (resource manager app summary log filename)
# - yarn.server.resourcemanager.appsummary.logger (resource manager app summary log level and appender)
#log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger}
#log4j.additivity.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=false
#log4j.appender.RMSUMMARY=org.apache.log4j.RollingFileAppender
#log4j.appender.RMSUMMARY.File=${hadoop.log.dir}/${yarn.server.resourcemanager.appsummary.log.file}
#log4j.appender.RMSUMMARY.MaxFileSize=256MB
#log4j.appender.RMSUMMARY.MaxBackupIndex=20
#log4j.appender.RMSUMMARY.layout=org.apache.log4j.PatternLayout
#log4j.appender.RMSUMMARY.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n

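In this template HDFS audit events are discarded by default (hdfs.audit.logger points at the NullAppender). Anyone resurrecting the example and wanting the audit trail on disk would typically repoint the logger at the RFAAUDIT appender already defined above, for example:

    hdfs.audit.logger=INFO,RFAAUDIT

The same pattern applies to hadoop.security.logger and mapred.audit.logger with the RFAS and MRAUDIT appenders.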
@@ -1,120 +0,0 @@
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>{{ hadoop['mapred_job_tracker_ha_servicename'] }}</value>
</property>
<property>
<name>mapred.jobtrackers.{{ hadoop['mapred_job_tracker_ha_servicename'] }}</name>
<value>{{ groups['hadoop_masters'] | join(',') }}</value>
<description>Comma-separated list of JobTracker IDs.</description>
</property>
<property>
<name>mapred.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>mapred.ha.zkfc.port</name>
<value>{{ hadoop['mapred_ha_zkfc_port'] }}</value>
</property>
<property>
<name>mapred.ha.fencing.methods</name>
<value>shell(/bin/true)</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>{{ groups.zookeeper_servers | join(':' ~ hadoop['zookeeper_clientport'] ~ ',') }}:{{ hadoop['zookeeper_clientport'] }}</value>
</property>
{% for host in groups['hadoop_masters'] %}
<property>
<name>mapred.jobtracker.rpc-address.{{ hadoop['mapred_job_tracker_ha_servicename'] }}.{{ host }}</name>
<value>{{ host }}:{{ hadoop['mapred_job_tracker_port'] }}</value>
</property>
{% endfor %}
{% for host in groups['hadoop_masters'] %}
<property>
<name>mapred.job.tracker.http.address.{{ hadoop['mapred_job_tracker_ha_servicename'] }}.{{ host }}</name>
<value>0.0.0.0:{{ hadoop['mapred_job_tracker_http_address_port'] }}</value>
</property>
{% endfor %}
{% for host in groups['hadoop_masters'] %}
<property>
<name>mapred.ha.jobtracker.rpc-address.{{ hadoop['mapred_job_tracker_ha_servicename'] }}.{{ host }}</name>
<value>{{ host }}:{{ hadoop['mapred_ha_jobtracker_rpc-address_port'] }}</value>
</property>
{% endfor %}
{% for host in groups['hadoop_masters'] %}
<property>
<name>mapred.ha.jobtracker.http-redirect-address.{{ hadoop['mapred_job_tracker_ha_servicename'] }}.{{ host }}</name>
<value>{{ host }}:{{ hadoop['mapred_job_tracker_http_address_port'] }}</value>
</property>
{% endfor %}
<property>
<name>mapred.jobtracker.restart.recover</name>
<value>true</value>
</property>
<property>
<name>mapred.job.tracker.persist.jobstatus.active</name>
<value>true</value>
</property>
<property>
<name>mapred.job.tracker.persist.jobstatus.hours</name>
<value>1</value>
</property>
<property>
<name>mapred.job.tracker.persist.jobstatus.dir</name>
<value>{{ hadoop['mapred_job_tracker_persist_jobstatus_dir'] }}</value>
</property>
<property>
<name>mapred.client.failover.proxy.provider.{{ hadoop['mapred_job_tracker_ha_servicename'] }}</name>
<value>org.apache.hadoop.mapred.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>mapred.client.failover.max.attempts</name>
<value>15</value>
</property>
<property>
<name>mapred.client.failover.sleep.base.millis</name>
<value>500</value>
</property>
<property>
<name>mapred.client.failover.sleep.max.millis</name>
<value>1500</value>
</property>
<property>
<name>mapred.client.failover.connection.retries</name>
<value>0</value>
</property>
<property>
<name>mapred.client.failover.connection.retries.on.timeouts</name>
<value>0</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>{{ hadoop["mapred_local_dir"] | join(',') }}</value>
</property>
<property>
<name>mapred.task.tracker.http.address</name>
<value>0.0.0.0:{{ hadoop['mapred_task_tracker_http_address_port'] }}</value>
</property>
</configuration>

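As a quick sanity check on the ha.zookeeper.quorum expression in the template above: with a hypothetical inventory of three hosts zk1, zk2 and zk3 in the zookeeper_servers group and zookeeper_clientport set to 2181 (both values are assumptions, since group_vars/all is not reproduced here), the join renders to:

    <property>
    <name>ha.zookeeper.quorum</name>
    <value>zk1:2181,zk2:2181,zk3:2181</value>
    </property>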
@@ -1,3 +0,0 @@
{% for host in groups['hadoop_slaves'] %}
{{ host }}
{% endfor %}

@@ -1,80 +0,0 @@
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<configuration>
<property>
<name>ssl.client.truststore.location</name>
<value></value>
<description>Truststore to be used by clients like distcp. Must be
specified.
</description>
</property>
<property>
<name>ssl.client.truststore.password</name>
<value></value>
<description>Optional. Default value is "".
</description>
</property>
<property>
<name>ssl.client.truststore.type</name>
<value>jks</value>
<description>Optional. The keystore file format, default value is "jks".
</description>
</property>
<property>
<name>ssl.client.truststore.reload.interval</name>
<value>10000</value>
<description>Truststore reload check interval, in milliseconds.
Default value is 10000 (10 seconds).
</description>
</property>
<property>
<name>ssl.client.keystore.location</name>
<value></value>
<description>Keystore to be used by clients like distcp. Must be
specified.
</description>
</property>
<property>
<name>ssl.client.keystore.password</name>
<value></value>
<description>Optional. Default value is "".
</description>
</property>
<property>
<name>ssl.client.keystore.keypassword</name>
<value></value>
<description>Optional. Default value is "".
</description>
</property>
<property>
<name>ssl.client.keystore.type</name>
<value>jks</value>
<description>Optional. The keystore file format, default value is "jks".
</description>
</property>
</configuration>

@@ -1,77 +0,0 @@
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<configuration>
<property>
<name>ssl.server.truststore.location</name>
<value></value>
<description>Truststore to be used by NN and DN. Must be specified.
</description>
</property>
<property>
<name>ssl.server.truststore.password</name>
<value></value>
<description>Optional. Default value is "".
</description>
</property>
<property>
<name>ssl.server.truststore.type</name>
<value>jks</value>
<description>Optional. The keystore file format, default value is "jks".
</description>
</property>
<property>
<name>ssl.server.truststore.reload.interval</name>
<value>10000</value>
<description>Truststore reload check interval, in milliseconds.
Default value is 10000 (10 seconds).
</description>
</property>
<property>
<name>ssl.server.keystore.location</name>
<value></value>
<description>Keystore to be used by NN and DN. Must be specified.
</description>
</property>
<property>
<name>ssl.server.keystore.password</name>
<value></value>
<description>Must be specified.
</description>
</property>
<property>
<name>ssl.server.keystore.keypassword</name>
<value></value>
<description>Must be specified.
</description>
</property>
<property>
<name>ssl.server.keystore.type</name>
<value>jks</value>
<description>Optional. The keystore file format, default value is "jks".
</description>
</property>
</configuration>

@@ -1,40 +0,0 @@
# Firewall configuration written by system-config-firewall
# Manual customization of this file is not recommended.
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
{% if 'hadoop_masters' in group_names %}
-A INPUT -p tcp --dport {{ hadoop['fs_default_FS_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['dfs_namenode_http_address_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['mapred_job_tracker_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['mapred_job_tracker_http_address_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['mapred_ha_jobtracker_rpc-address_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['mapred_ha_zkfc_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['dfs_ha_zkfc_port'] }} -j ACCEPT
{% endif %}
{% if 'hadoop_slaves' in group_names %}
-A INPUT -p tcp --dport {{ hadoop['dfs_datanode_address_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['dfs_datanode_http_address_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['dfs_datanode_ipc_address_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['mapred_task_tracker_http_address_port'] }} -j ACCEPT
{% endif %}
{% if 'qjournal_servers' in group_names %}
-A INPUT -p tcp --dport {{ hadoop['qjournal_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['qjournal_http_port'] }} -j ACCEPT
{% endif %}
{% if 'zookeeper_servers' in group_names %}
-A INPUT -p tcp --dport {{ hadoop['zookeeper_clientport'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['zookeeper_leader_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['zookeeper_election_port'] }} -j ACCEPT
{% endif %}
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
COMMIT

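For a host that sits only in hadoop_slaves, and assuming the stock CDH4/MRv1 port numbers (50010, 50075, 50020 and 50060; the real values come from group_vars/all, which is not reproduced here), the slave block of this template expands to four ACCEPT rules along the lines of:

    -A INPUT -p tcp --dport 50010 -j ACCEPT
    -A INPUT -p tcp --dport 50075 -j ACCEPT
    -A INPUT -p tcp --dport 50020 -j ACCEPT
    -A INPUT -p tcp --dport 50060 -j ACCEPT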
@@ -1,14 +0,0 @@
---
# Handlers for the hadoop master services
- name: restart hadoop master services
service: name=${item} state=restarted
with_items:
- hadoop-0.20-mapreduce-jobtracker
- hadoop-hdfs-namenode
- name: restart hadoopha master services
service: name=${item} state=restarted
with_items:
- hadoop-0.20-mapreduce-jobtrackerha
- hadoop-hdfs-namenode

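These handlers, like the other ${item} references further down, use the legacy Ansible 1.x variable syntax. On a current Ansible release the first handler would be written roughly as:

    - name: restart hadoop master services
      service:
        name: "{{ item }}"
        state: restarted
      loop:
        - hadoop-0.20-mapreduce-jobtracker
        - hadoop-hdfs-namenode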
@@ -1,38 +0,0 @@
---
# Playbook for Hadoop master servers
- name: Install the namenode and jobtracker packages
yum: name={{ item }} state=installed
with_items:
- hadoop-0.20-mapreduce-jobtrackerha
- hadoop-hdfs-namenode
- hadoop-hdfs-zkfc
- hadoop-0.20-mapreduce-zkfc
- name: Copy the hadoop configuration files
template: src=roles/common/templates/hadoop_ha_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
with_items:
- core-site.xml
- hadoop-metrics.properties
- hadoop-metrics2.properties
- hdfs-site.xml
- log4j.properties
- mapred-site.xml
- slaves
- ssl-client.xml.example
- ssl-server.xml.example
notify: restart hadoopha master services
- name: Create the data directory for the namenode metadata
file: path={{ item }} owner=hdfs group=hdfs state=directory
with_items: hadoop.dfs_namenode_name_dir
- name: Create the data directory for the jobtracker ha
file: path={{ item }} owner=mapred group=mapred state=directory
with_items: hadoop.mapred_job_tracker_persist_jobstatus_dir
- name: Format the namenode
shell: creates=/usr/lib/hadoop/namenode.formatted su - hdfs -c "hadoop namenode -format" && touch /usr/lib/hadoop/namenode.formatted
- name: start hadoop namenode services
service: name=hadoop-hdfs-namenode state=started

@@ -1,38 +0,0 @@
---
# Playbook for Hadoop master servers
- name: Install the namenode and jobtracker packages
yum: name={{ item }} state=installed
with_items:
- hadoop-0.20-mapreduce-jobtracker
- hadoop-hdfs-namenode
- name: Copy the hadoop configuration files for no ha
template: src=roles/common/templates/hadoop_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
with_items:
- core-site.xml
- hadoop-metrics.properties
- hadoop-metrics2.properties
- hdfs-site.xml
- log4j.properties
- mapred-site.xml
- slaves
- ssl-client.xml.example
- ssl-server.xml.example
notify: restart hadoop master services
- name: Create the data directory for the namenode metadata
file: path={{ item }} owner=hdfs group=hdfs state=directory
with_items: hadoop.dfs_namenode_name_dir
- name: Format the namenode
shell: creates=/usr/lib/hadoop/namenode.formatted su - hdfs -c "hadoop namenode -format" && touch /usr/lib/hadoop/namenode.formatted
- name: start hadoop namenode services
service: name=hadoop-hdfs-namenode state=started
- name: Give permissions for mapred users
shell: creates=/usr/lib/hadoop/namenode.initialized su - hdfs -c "hadoop fs -chown hdfs:hadoop /"; su - hdfs -c "hadoop fs -chmod 0775 /" && touch /usr/lib/hadoop/namenode.initialized
- name: start hadoop jobtracker services
service: name=hadoop-0.20-mapreduce-jobtracker state=started

@@ -1,9 +0,0 @@
---
# Playbook for Hadoop master primary servers
- include: hadoop_master.yml
when: ha_enabled
- include: hadoop_master_no_ha.yml
when: not ha_enabled

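The ha_enabled flag that chooses between the two include files is expected to come from the inventory or from group_vars/all (neither is shown in this part of the diff); a minimal sketch is a single boolean there, e.g.

    ha_enabled: true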
@@ -1,14 +0,0 @@
---
# Handlers for the hadoop master services
- name: restart hadoop master services
service: name=${item} state=restarted
with_items:
- hadoop-0.20-mapreduce-jobtracker
- hadoop-hdfs-namenode
- name: restart hadoopha master services
service: name=${item} state=restarted
with_items:
- hadoop-0.20-mapreduce-jobtrackerha
- hadoop-hdfs-namenode

@@ -1,64 +0,0 @@
---
# Playbook for Hadoop master secondary server
- name: Install the namenode and jobtracker packages
yum: name=${item} state=installed
with_items:
- hadoop-0.20-mapreduce-jobtrackerha
- hadoop-hdfs-namenode
- hadoop-hdfs-zkfc
- hadoop-0.20-mapreduce-zkfc
- name: Copy the hadoop configuration files
template: src=roles/common/templates/hadoop_ha_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
with_items:
- core-site.xml
- hadoop-metrics.properties
- hadoop-metrics2.properties
- hdfs-site.xml
- log4j.properties
- mapred-site.xml
- slaves
- ssl-client.xml.example
- ssl-server.xml.example
notify: restart hadoopha master services
- name: Create the data directory for the namenode metadata
file: path={{ item }} owner=hdfs group=hdfs state=directory
with_items: hadoop.dfs_namenode_name_dir
- name: Create the data directory for the jobtracker ha
file: path={{ item }} owner=mapred group=mapred state=directory
with_items: hadoop.mapred_job_tracker_persist_jobstatus_dir
- name: Initialize the secondary namenode
shell: creates=/usr/lib/hadoop/namenode.formatted su - hdfs -c "hadoop namenode -bootstrapStandby" && touch /usr/lib/hadoop/namenode.formatted
- name: start hadoop namenode services
service: name=hadoop-hdfs-namenode state=started
- name: Initialize the zkfc for namenode
shell: creates=/usr/lib/hadoop/zkfc.formatted su - hdfs -c "hdfs zkfc -formatZK" && touch /usr/lib/hadoop/zkfc.formatted
- name: start zkfc for namenodes
service: name=hadoop-hdfs-zkfc state=started
delegate_to: ${item}
with_items: groups.hadoop_masters
- name: Give permissions for mapred users
shell: creates=/usr/lib/hadoop/fs.initialized su - hdfs -c "hadoop fs -chown hdfs:hadoop /"; su - hdfs -c "hadoop fs -chmod 0774 /" && touch /usr/lib/hadoop/fs.initialized
- name: Initialize the zkfc for jobtracker
shell: creates=/usr/lib/hadoop/zkfcjob.formatted su - mapred -c "hadoop mrzkfc -formatZK" && touch /usr/lib/hadoop/zkfcjob.formatted
- name: start zkfc for jobtracker
service: name=hadoop-0.20-mapreduce-zkfc state=started
delegate_to: '{{ item }}'
with_items: groups.hadoop_masters
- name: start hadoop Jobtracker services
service: name=hadoop-0.20-mapreduce-jobtrackerha state=started
delegate_to: '{{ item }}'
with_items: groups.hadoop_masters

@@ -1,8 +0,0 @@
---
# Handlers for the hadoop slave services
- name: restart hadoop slave services
service: name=${item} state=restarted
with_items:
- hadoop-0.20-mapreduce-tasktracker
- hadoop-hdfs-datanode

@@ -1,4 +0,0 @@
---
# Playbook for Hadoop slave servers
- include: slaves.yml tags=slaves

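Because the include is tagged, the slave nodes could be reconfigured on their own without touching the masters or the zookeeper and journal nodes, e.g.

    ansible-playbook -i hosts site.yml --tags slaves

where hosts is the inventory file shipped alongside this example.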
@@ -1,53 +0,0 @@
---
# Playbook for Hadoop slave servers
- name: Install the datanode and tasktracker packages
yum: name=${item} state=installed
with_items:
- hadoop-0.20-mapreduce-tasktracker
- hadoop-hdfs-datanode
- name: Copy the hadoop configuration files
template: src=roles/common/templates/hadoop_ha_conf/${item}.j2 dest=/etc/hadoop/conf/${item}
with_items:
- core-site.xml
- hadoop-metrics.properties
- hadoop-metrics2.properties
- hdfs-site.xml
- log4j.properties
- mapred-site.xml
- slaves
- ssl-client.xml.example
- ssl-server.xml.example
when: ha_enabled
notify: restart hadoop slave services
- name: Copy the hadoop configuration files for non ha
template: src=roles/common/templates/hadoop_conf/${item}.j2 dest=/etc/hadoop/conf/${item}
with_items:
- core-site.xml
- hadoop-metrics.properties
- hadoop-metrics2.properties
- hdfs-site.xml
- log4j.properties
- mapred-site.xml
- slaves
- ssl-client.xml.example
- ssl-server.xml.example
when: not ha_enabled
notify: restart hadoop slave services
- name: Create the data directory for the slave nodes to store the data
file: path={{ item }} owner=hdfs group=hdfs state=directory
with_items: hadoop.dfs_datanode_data_dir
- name: Create the data directory for the slave nodes for mapreduce
file: path={{ item }} owner=mapred group=mapred state=directory
with_items: hadoop.mapred_local_dir
- name: start hadoop slave services
service: name={{ item }} state=started
with_items:
- hadoop-0.20-mapreduce-tasktracker
- hadoop-hdfs-datanode

@@ -1,5 +0,0 @@
---
# The journal node handlers
- name: restart qjournal services
service: name=hadoop-hdfs-journalnode state=restarted

@@ -1,22 +0,0 @@
---
# Playbook for the qjournal nodes
- name: Install the qjournal package
yum: name=hadoop-hdfs-journalnode state=installed
- name: Create folder for Journaling
file: path={{ hadoop.dfs_journalnode_edits_dir }} state=directory owner=hdfs group=hdfs
- name: Copy the hadoop configuration files
template: src=roles/common/templates/hadoop_ha_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
with_items:
- core-site.xml
- hadoop-metrics.properties
- hadoop-metrics2.properties
- hdfs-site.xml
- log4j.properties
- mapred-site.xml
- slaves
- ssl-client.xml.example
- ssl-server.xml.example
notify: restart qjournal services

@@ -1,5 +0,0 @@
---
# Handler for the zookeeper services
- name: restart zookeeper
service: name=zookeeper-server state=restarted

@@ -1,13 +0,0 @@
---
# The plays for zookeeper daemons
- name: Install the zookeeper files
yum: name=zookeeper-server state=installed
- name: Copy the configuration file for zookeeper
template: src=zoo.cfg.j2 dest=/etc/zookeeper/conf/zoo.cfg
notify: restart zookeeper
- name: Initialize the zookeeper
shell: creates=/var/lib/zookeeper/myid service zookeeper-server init --myid=${zoo_id}

@@ -1,9 +0,0 @@
tickTime=2000
dataDir=/var/lib/zookeeper/
clientPort={{ hadoop['zookeeper_clientport'] }}
initLimit=5
syncLimit=2
{% for host in groups['zookeeper_servers'] %}
server.{{ hostvars[host].zoo_id }}={{ host }}:{{ hadoop['zookeeper_leader_port'] }}:{{ hadoop['zookeeper_election_port'] }}
{% endfor %}

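Rendered against a hypothetical three-node zookeeper_servers group (zk1, zk2 and zk3 with zoo_id 1 to 3) and the stock ZooKeeper ports 2181, 2888 and 3888 (the actual values live in group_vars/all), the template above produces something like:

    tickTime=2000
    dataDir=/var/lib/zookeeper/
    clientPort=2181
    initLimit=5
    syncLimit=2
    server.1=zk1:2888:3888
    server.2=zk2:2888:3888
    server.3=zk3:2888:3888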
@@ -1,28 +0,0 @@
---
# The main playbook to deploy the site
- hosts: hadoop_all
roles:
- common
- hosts: zookeeper_servers
roles:
- { role: zookeeper_servers, when: ha_enabled }
- hosts: qjournal_servers
roles:
- { role: qjournal_servers, when: ha_enabled }
- hosts: hadoop_master_primary
roles:
- { role: hadoop_primary }
- hosts: hadoop_master_secondary
roles:
- { role: hadoop_secondary, when: ha_enabled }
- hosts: hadoop_slaves
roles:
- { role: hadoop_slaves }
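The whole stack was driven from this single playbook; a typical run from the hadoop/ directory would have looked like

    ansible-playbook -i hosts site.yml

with ha_enabled and the various port and directory variables defined in group_vars/all, and the host groups (hadoop_all, hadoop_master_primary, hadoop_master_secondary, hadoop_slaves, zookeeper_servers, qjournal_servers) laid out in the hosts inventory file.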