diff --git a/hadoop/README.md b/hadoop/README.md
new file mode 100644
index 0000000..5b6f1f4
--- /dev/null
+++ b/hadoop/README.md
@@ -0,0 +1,359 @@
+# Deploying Hadoop Clusters using Ansible
+
+## Preface
+
+The playbooks in this example are designed to deploy a Hadoop cluster on a
+CentOS 6 or RHEL 6 environment using Ansible. The playbooks can:
+
+1) Deploy a fully functional Hadoop cluster with HA and automatic failover.
+
+2) Deploy a fully functional Hadoop cluster with no HA.
+
+3) Deploy additional nodes to scale the cluster.
+
+These playbooks require Ansible 1.2, CentOS 6 or RHEL 6 target machines, and install
+the open-source Cloudera Hadoop Distribution (CDH) version 4.
+
+## Hadoop Components
+
+Hadoop is a framework that allows processing of large datasets across large
+clusters. The two main components that make up a Hadoop cluster are the HDFS
+filesystem and the MapReduce framework. Briefly, the HDFS filesystem is responsible
+for storing data across the cluster nodes on its local disks. The MapReduce
+jobs are the tasks that run on these nodes to produce a meaningful result
+from the data stored on the HDFS filesystem.
+
+Let's have a closer look at each of these components.
+
+## HDFS
+
+![Alt text](/images/hdfs.png "HDFS")
+
+The above diagram illustrates an HDFS filesystem. The cluster consists of three
+DataNodes, which are responsible for storing/replicating data, while the NameNode
+is a process which is responsible for storing the metadata for the entire
+filesystem. As the example above illustrates, when a client wants to write a
+file to the HDFS cluster, it first contacts the NameNode and lets it know that
+it wants to write a file. The NameNode then decides where and how the file
+should be saved and notifies the client about its decision.
+
+In the given example "File1" has a size of 128 MB and the block size of the HDFS
+filesystem is 64 MB. Hence, the NameNode instructs the client to break down the
+file into two different blocks and write the first block to DataNode 1 and the
+second block to DataNode 2. Upon receiving the notification from the NameNode,
+the client contacts DataNode 1 and DataNode 2 directly to write the data.
+
+Once the data is received by the DataNodes, they replicate the block across the
+other nodes. The number of nodes across which the data is replicated is
+based on the HDFS configuration, the default value being 3. Meanwhile the
+NameNode updates its metadata with the entry of the new file "File1" and the
+locations where the parts are stored.
+
+## MapReduce
+
+MapReduce is a Java application that utilizes the data stored in the
+HDFS filesystem to produce some useful and meaningful result. The whole job is
+split into two parts: the "Map" job and the "Reduce" job.
+
+Let's consider an example. In the previous step we uploaded "File1"
+into the HDFS filesystem and the file was broken down into two different
+blocks. Let's assume that the first block had the data "black sheep" in it and
+the second block had the data "white sheep" in it. Now let's assume a client
+wants to get a count of all the words occurring in "File1". In order to get the
+count, the first thing the client would have to do is write a "Map" program and
+then a "Reduce" program.
+
+Here's some pseudo code showing how the Map and Reduce jobs might look:
+
+    mapper (File1, file-contents):
+        for each word in file-contents:
+            emit (word, 1)
+
+    reducer (word, values):
+        sum = 0
+        for each value in values:
+            sum = sum + value
+        emit (word, sum)
+
+The work of the Map job is to go through all the words in the file and emit
+a key/value pair. In this case the key is the word itself and the value is always
+1.
+
+The Reduce job is quite simple: for each value it gets, it adds that value
+(always 1 here) to the running sum.
+
+Once the Map and Reduce jobs are ready, the client would instruct the
+"JobTracker" (the process responsible for scheduling the jobs on the cluster)
+to run the MapReduce job on "File1".
+
+Let's have a closer look at the anatomy of a Map job.
+
+
+![Alt text](/images/map.png "Map job")
+
+As the figure above shows, when the client instructs the JobTracker to run a
+job on File1, the JobTracker first contacts the NameNode to determine where the
+blocks of File1 are. Then the JobTracker sends the Map job's JAR file down
+to the nodes having the blocks, and the TaskTracker processes on those nodes run
+the application.
+
+In the above example, DataNode 1 and DataNode 2 have the blocks, so the
+TaskTrackers on those nodes run the Map jobs. Once the jobs are completed the
+two nodes would have key/value results as below:
+
+MapJob Results:
+
+    TaskTracker1:
+        "Black: 1"
+        "Sheep: 1"
+
+    TaskTracker2:
+        "White: 1"
+        "Sheep: 1"
+
+
+Once the Map phase is completed the JobTracker process initiates the Shuffle
+and Reduce process.
+
+Let's have a closer look at the Shuffle-Reduce job.
+
+![Alt text](/images/reduce.png "Reduce job")
+
+As the figure above demonstrates, the first thing that the JobTracker does is
+spawn a Reducer job on the DataNode/TaskTracker nodes for each "key" in the job
+result. In this case we have three keys in our result: "black", "white", and
+"sheep", so three Reducers are spawned: one for each key. The Map jobs shuffle
+the keys out to the respective Reduce jobs. Then the Reduce job code runs and
+the sum is calculated, and the result is written into the HDFS filesystem in a
+common directory. In the above example the output directory is specified as
+"/home/ben/output", so all the Reducers will write their results into this
+directory under different filenames; the file names being "part-00xx", where xx
+is the Reducer/partition number.
+
+
+## Hadoop Deployment
+
+![Alt text](/images/hadoop.png "Hadoop deployment")
+
+The above diagram depicts a typical Hadoop deployment. The NameNode and
+JobTracker usually reside on the same machine, though they can run on separate
+machines. The DataNodes and TaskTrackers run on the same nodes. The size of the
+cluster can be scaled to thousands of nodes with petabytes of storage.
+
+The above deployment model provides redundancy for data as the HDFS filesystem
+takes care of the data replication. The only single points of failure are the
+NameNode and the JobTracker; if either of these components fails, the cluster
+will not be usable.
+
+
+## Making Hadoop HA
+
+To make the Hadoop cluster highly available we would have to add another set of
+JobTracker/NameNodes, and make sure that the metadata updated on the primary is
+also kept up to date on the secondary. In case of failure of the primary node,
+the secondary node takes over that role.
+
+The first thing that has to be dealt with is the data held by the NameNode.
+As we recall, the NameNode holds all of the metadata about the filesystem, so any
+update to the metadata should also be reflected on the secondary NameNode's
+metadata copy. The synchronization of the primary and secondary NameNode
+metadata is handled by the Quorum Journal Manager.
+
+
+### Quorum Journal Manager
+
+![Alt text](/images/qjm.png "QJM")
+
+As the figure above shows, the Quorum Journal Manager consists of the journal
+manager clients and journal manager nodes. The journal manager clients reside
+on the same nodes as the NameNodes; the client on the primary node collects all
+the edit logs happening on the NameNode and sends them out to the journal nodes.
+The journal manager client residing on the secondary NameNode regularly contacts
+the journal nodes and updates its local metadata to be consistent with the
+master node. In case of primary node failure the secondary NameNode updates
+itself to the latest edit logs and takes over as the primary NameNode.
+
+
+### Zookeeper
+
+Apart from data consistency, a distributed cluster system also needs a
+mechanism for centralized coordination. For example, there should be a way for
+the secondary node to tell if the primary node is running properly, and if not,
+it has to take up the role of the primary. Zookeeper provides Hadoop with a
+mechanism to coordinate in this way.
+
+![Alt text](/images/zookeeper.png "Zookeeper")
+
+As the figure above shows, the Zookeeper service is a client/server based
+service. The server component itself is replicated over a set of machines that
+comprise the service. In short, high availability is built into the Zookeeper
+servers.
+
+For Hadoop, two Zookeeper clients have been built: ZKFC (Zookeeper Failover
+Controller), one for the NameNode and one for the JobTracker. These clients run
+on the same machines as the NameNodes/JobTrackers themselves.
+
+When a ZKFC client is started, it establishes a connection with one of the
+Zookeeper nodes and obtains a session ID. The client then keeps a health check
+on the NameNode/JobTracker and sends heartbeats to ZooKeeper.
+
+If the ZKFC client detects a failure of the NameNode/JobTracker, it removes
+itself from the ZooKeeper active/standby election, and the other ZKFC client
+fences the node/service and takes over the primary role.
+
+
+## Hadoop HA Deployment
+
+![Alt text](/images/hadoopha.png "Hadoop_HA")
+
+The above diagram depicts a fully HA Hadoop cluster with no single point of
+failure and automated failover.
+
+
+## Deploying Hadoop Clusters with Ansible
+
+Setting up a Hadoop cluster without HA can itself be a challenging and
+time-consuming task, and with HA, things become even more difficult.
+
+Ansible can automate the whole process of deploying a Hadoop cluster, with or
+without HA, using the same playbook, in a matter of minutes. This can be used for
+quick environment rebuilds, and in case of disaster or node failures, recovery
+time can be greatly reduced with Ansible automation.
+
+Let's have a look to see how this is done.
+
+## Deploying a Hadoop cluster with HA
+
+### Prerequisites
+
+These playbooks have been tested using Ansible v1.2 and CentOS 6.x (64 bit).
+
+Modify group_vars/all to choose the network interface for Hadoop communication.
+
+Optionally, you can change Hadoop-specific parameters like ports or directories
+by editing the group_vars/all file.
+
+Before launching the deployment playbook, make sure the inventory file (hosts)
+is set up properly. Here's a sample:
+
+    [hadoop_master_primary]
+    zhadoop1
+
+    [hadoop_master_secondary]
+    zhadoop2
+
+    [hadoop_masters:children]
+    hadoop_master_primary
+    hadoop_master_secondary
+
+    [hadoop_slaves]
+    hadoop1
+    hadoop2
+    hadoop3
+
+    [qjournal_servers]
+    zhadoop1
+    zhadoop2
+    zhadoop3
+
+    [zookeeper_servers]
+    zhadoop1 zoo_id=1
+    zhadoop2 zoo_id=2
+    zhadoop3 zoo_id=3
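+
+Before running the playbook, it's worth confirming that Ansible can actually reach
+every host in the inventory. A minimal sanity check, assuming SSH access to the
+target machines is already configured, is to use Ansible's built-in ping module:
+
+    ansible -i hosts all -m ping
+
+Every host should answer with "pong" before you proceed with the deployment.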
+
+Once the inventory is set up, the Hadoop cluster can be deployed using the following
+command:
+
+    ansible-playbook -i hosts site.yml
+
+Once deployed, we can check the cluster sanity in different ways. To check the
+status of the HDFS filesystem and a report on all the DataNodes, log in as the
+'hdfs' user on any Hadoop master server, and issue the following command to get
+the report:
+
+    hadoop dfsadmin -report
+
+To check the sanity of HA, first log in as the 'hdfs' user on any Hadoop master
+server and get the current active/standby NameNode servers this way:
+
+    -bash-4.1$ hdfs haadmin -getServiceState zhadoop1
+    active
+    -bash-4.1$ hdfs haadmin -getServiceState zhadoop2
+    standby
+
+To get the state of the JobTracker process, log in as the 'mapred' user on any
+Hadoop master server and issue the following command:
+
+    -bash-4.1$ hadoop mrhaadmin -getServiceState hadoop1
+    standby
+    -bash-4.1$ hadoop mrhaadmin -getServiceState hadoop2
+    active
+
+Once you have determined which server is active and which is standby, you can
+kill the NameNode/JobTracker process on the server listed as active and issue
+the same commands again, and you should see that the standby has been promoted
+to the active state. Later, you can restart the killed process and see that node
+listed as standby.
+
+### Running a MapReduce Job
+
+To deploy the MapReduce job, run the following script from any of the Hadoop
+master nodes as the user 'hdfs'. The job counts the number of occurrences of the
+word 'hello' in the given input file, e.g.: su - hdfs -c "/tmp/job.sh"
+
+    #!/bin/bash
+    cat > /tmp/inputfile << EOF
+    hello
+    sf
+    sdf
+    hello
+    sdf
+    sdf
+    EOF
+    hadoop fs -put /tmp/inputfile /inputfile
+    hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar grep /inputfile /outputfile 'hello'
+    hadoop fs -get /outputfile /tmp/outputfile/
+
+To verify the result, read the file on the server located at
+/tmp/outputfile/part-00000, which should give you the count.
+
+## Scale the Cluster
+
+When the Hadoop cluster reaches its maximum capacity, it can be scaled by
+adding nodes. This can be easily accomplished by adding the node hostname to
+the Ansible inventory under the hadoop_slaves group, and running the following
+command:
+
+    ansible-playbook -i hosts site.yml --tags=slaves
+
+## Deploy a non-HA Hadoop Cluster
+
+The following diagram illustrates a standalone Hadoop cluster.
+
+To deploy this cluster, fill in the inventory file as follows:
+
+    [hadoop_master_primary]
+    zhadoop1
+
+    [hadoop_master_secondary]
+
+    [hadoop_masters:children]
+    hadoop_master_primary
+    hadoop_master_secondary
+
+    [hadoop_slaves]
+    hadoop1
+    hadoop2
+    hadoop3
+
+Edit the group_vars/all file to disable HA:
+
+    ha_enabled: False
+
+And run the following command:
+
+    ansible-playbook -i hosts site.yml
+
+The validity of the cluster can be checked by running the same MapReduce job
+that was documented above for an HA Hadoop cluster.
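+
+As an additional quick check after a non-HA deployment (a sketch; the service names
+below are the CDH4 init scripts these playbooks install), you can confirm on the
+master that the NameNode and JobTracker services are running and that HDFS is
+browsable:
+
+    service hadoop-hdfs-namenode status
+    service hadoop-0.20-mapreduce-jobtracker status
+    su - hdfs -c "hadoop fs -ls /"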
diff --git a/hadoop/group_vars/all b/hadoop/group_vars/all
new file mode 100644
index 0000000..019dc13
--- /dev/null
+++ b/hadoop/group_vars/all
@@ -0,0 +1,56 @@
+# Defaults to the first ethernet interface. Change this to:
+#
+# iface: eth1
+#
+# ...to override.
+#
+iface: '{{ ansible_default_ipv4.interface }}'
+
+ha_enabled: False
+
+hadoop:
+
+#Variables for - common
+
+  fs_default_FS_port: 8020
+  nameservice_id: mycluster4
+
+#Variables for
+
+  dfs_permissions_superusergroup: hdfs
+  dfs_namenode_name_dir:
+    - /namedir1/
+    - /namedir2/
+  dfs_replication: 3
+  dfs_namenode_handler_count: 50
+  dfs_blocksize: 67108864
+  dfs_datanode_data_dir:
+    - /datadir1/
+    - /datadir2/
+  dfs_datanode_address_port: 50010
+  dfs_datanode_http_address_port: 50075
+  dfs_datanode_ipc_address_port: 50020
+  dfs_namenode_http_address_port: 50070
+  dfs_ha_zkfc_port: 8019
+  qjournal_port: 8485
+  qjournal_http_port: 8480
+  dfs_journalnode_edits_dir: /journaldir/
+  zookeeper_clientport: 2181
+  zookeeper_leader_port: 2888
+  zookeeper_election_port: 3888
+
+#Variables for - common
+  mapred_job_tracker_ha_servicename: myjt4
+  mapred_job_tracker_http_address_port: 50030
+  mapred_task_tracker_http_address_port: 50060
+  mapred_job_tracker_port: 8021
+  mapred_ha_jobtracker_rpc-address_port: 8023
+  mapred_ha_zkfc_port: 8018
+  mapred_job_tracker_persist_jobstatus_dir: /jobdir/
+  mapred_local_dir:
+    - /mapred1/
+    - /mapred2/
+
+
diff --git a/hadoop/hosts b/hadoop/hosts
new file mode 100644
index 0000000..06cc524
--- /dev/null
+++ b/hadoop/hosts
@@ -0,0 +1,31 @@
+[hadoop_all:children]
+hadoop_masters
+hadoop_slaves
+qjournal_servers
+zookeeper_servers
+
+[hadoop_master_primary]
+hadoop1
+
+[hadoop_master_secondary]
+hadoop2
+
+[hadoop_masters:children]
+hadoop_master_primary
+hadoop_master_secondary
+
+[hadoop_slaves]
+hadoop1
+hadoop2
+hadoop3
+
+[qjournal_servers]
+hadoop1
+hadoop2
+hadoop3
+
+[zookeeper_servers]
+hadoop1 zoo_id=1
+hadoop2 zoo_id=2
+hadoop3 zoo_id=3
+
diff --git a/hadoop/images/hadoop.png b/hadoop/images/hadoop.png
new file mode 100644
index 0000000..37b09e2
Binary files /dev/null and b/hadoop/images/hadoop.png differ
diff --git a/hadoop/images/hadoopha.png b/hadoop/images/hadoopha.png
new file mode 100644
index 0000000..895b2aa
Binary files /dev/null and b/hadoop/images/hadoopha.png differ
diff --git a/hadoop/images/hdfs.png b/hadoop/images/hdfs.png
new file mode 100644
index 0000000..475b483
Binary files /dev/null and b/hadoop/images/hdfs.png differ
diff --git a/hadoop/images/map.png b/hadoop/images/map.png
new file mode 100644
index 0000000..448c73d
Binary files /dev/null and b/hadoop/images/map.png differ
diff --git a/hadoop/images/qjm.png b/hadoop/images/qjm.png
new file mode 100644
index 0000000..d2c7896
Binary files /dev/null and b/hadoop/images/qjm.png differ
diff --git a/hadoop/images/reduce.png b/hadoop/images/reduce.png
new file mode 100644
index 0000000..9a7d1c5
Binary files /dev/null and b/hadoop/images/reduce.png differ
diff --git a/hadoop/images/zookeeper.png b/hadoop/images/zookeeper.png
new file mode 100644
index 0000000..e9cda52
Binary files /dev/null and b/hadoop/images/zookeeper.png differ
diff --git a/hadoop/playbooks/inputfile b/hadoop/playbooks/inputfile
new file mode 100644
index 0000000..00dbe3f
--- /dev/null
+++ b/hadoop/playbooks/inputfile
@@ -0,0 +1,19 @@
+asdf
+sdf
+sdf
+sd
+f
+sf
+sdf
+sd
+fsd
+hello
+asf
+sf
+sd
+fsd
+f
+sdf
+sd
+hello
+
diff --git a/hadoop/playbooks/job.yml b/hadoop/playbooks/job.yml
new file mode 100644
index 0000000..a31e95e
--- /dev/null
+++ b/hadoop/playbooks/job.yml
@@ -0,0 +1,21 @@
+---
+# Launch a job to count occurrences of a word.
+ +- hosts: $server + user: root + tasks: + - name: copy the file + copy: src=inputfile dest=/tmp/inputfile + + - name: upload the file + shell: su - hdfs -c "hadoop fs -put /tmp/inputfile /inputfile" + + - name: Run the MapReduce job to count the occurance of word hello + shell: su - hdfs -c "hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar grep /inputfile /outputfile 'hello'" + + - name: Fetch the outputfile to local tmp dir + shell: su - hdfs -c "hadoop fs -get /outputfile /tmp/outputfile" + + - name: Get the outputfile to ansible server + fetch: dest=/tmp src=/tmp/outputfile/part-00000 + diff --git a/hadoop/roles/common/files/etc/cloudera-CDH4.repo b/hadoop/roles/common/files/etc/cloudera-CDH4.repo new file mode 100644 index 0000000..249f664 --- /dev/null +++ b/hadoop/roles/common/files/etc/cloudera-CDH4.repo @@ -0,0 +1,5 @@ +[cloudera-cdh4] +name=Cloudera's Distribution for Hadoop, Version 4 +baseurl=http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/4/ +gpgkey = http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera +gpgcheck = 1 diff --git a/hadoop/roles/common/handlers/main.yml b/hadoop/roles/common/handlers/main.yml new file mode 100644 index 0000000..8860ce2 --- /dev/null +++ b/hadoop/roles/common/handlers/main.yml @@ -0,0 +1,2 @@ +- name: restart iptables + service: name=iptables state=restarted diff --git a/hadoop/roles/common/tasks/common.yml b/hadoop/roles/common/tasks/common.yml new file mode 100644 index 0000000..d5f089e --- /dev/null +++ b/hadoop/roles/common/tasks/common.yml @@ -0,0 +1,28 @@ +--- +# The playbook for common tasks + +- name: Deploy the Cloudera Repository + copy: src=etc/cloudera-CDH4.repo dest=/etc/yum.repos.d/cloudera-CDH4.repo + +- name: Install the openjdk package + yum: name=java-1.6.0-openjdk state=installed + +- name: Create a directory for java + file: state=directory path=/usr/java/ + +- name: Create a link for java + file: src=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre state=link path=/usr/java/default + +- name: Create the hosts file for all machines + template: src=etc/hosts.j2 dest=/etc/hosts + +- name: Disable SELinux in conf file + lineinfile: dest=/etc/sysconfig/selinux regexp='^SELINUX=' line='SELINUX=disabled' state=present + +- name: Disable SELinux dynamically + shell: creates=/etc/sysconfig/selinux.disabled setenforce 0 ; touch /etc/sysconfig/selinux.disabled + +- name: Create the iptables file for all machines + template: src=iptables.j2 dest=/etc/sysconfig/iptables + notify: restart iptables + diff --git a/hadoop/roles/common/tasks/main.yml b/hadoop/roles/common/tasks/main.yml new file mode 100644 index 0000000..ef6677f --- /dev/null +++ b/hadoop/roles/common/tasks/main.yml @@ -0,0 +1,5 @@ +--- +# The playbook for common tasks + +- include: common.yml tags=slaves + diff --git a/hadoop/roles/common/templates/etc/hosts.j2 b/hadoop/roles/common/templates/etc/hosts.j2 new file mode 100644 index 0000000..df5ed34 --- /dev/null +++ b/hadoop/roles/common/templates/etc/hosts.j2 @@ -0,0 +1,5 @@ +127.0.0.1 localhost +{% for host in groups.all %} +{{ hostvars[host]['ansible_' + iface].ipv4.address }} {{ host }} +{% endfor %} + diff --git a/hadoop/roles/common/templates/hadoop_conf/core-site.xml.j2 b/hadoop/roles/common/templates/hadoop_conf/core-site.xml.j2 new file mode 100644 index 0000000..75ac8f6 --- /dev/null +++ b/hadoop/roles/common/templates/hadoop_conf/core-site.xml.j2 @@ -0,0 +1,25 @@ + + + + + + + fs.defaultFS + hdfs://{{ hostvars[groups['hadoop_masters'][0]]['ansible_hostname'] + ':' ~ 
hadoop['fs_default_FS_port'] }}/ + + diff --git a/hadoop/roles/common/templates/hadoop_conf/hadoop-metrics.properties.j2 b/hadoop/roles/common/templates/hadoop_conf/hadoop-metrics.properties.j2 new file mode 100644 index 0000000..c1b2eb7 --- /dev/null +++ b/hadoop/roles/common/templates/hadoop_conf/hadoop-metrics.properties.j2 @@ -0,0 +1,75 @@ +# Configuration of the "dfs" context for null +dfs.class=org.apache.hadoop.metrics.spi.NullContext + +# Configuration of the "dfs" context for file +#dfs.class=org.apache.hadoop.metrics.file.FileContext +#dfs.period=10 +#dfs.fileName=/tmp/dfsmetrics.log + +# Configuration of the "dfs" context for ganglia +# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter) +# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext +# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 +# dfs.period=10 +# dfs.servers=localhost:8649 + + +# Configuration of the "mapred" context for null +mapred.class=org.apache.hadoop.metrics.spi.NullContext + +# Configuration of the "mapred" context for file +#mapred.class=org.apache.hadoop.metrics.file.FileContext +#mapred.period=10 +#mapred.fileName=/tmp/mrmetrics.log + +# Configuration of the "mapred" context for ganglia +# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter) +# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext +# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 +# mapred.period=10 +# mapred.servers=localhost:8649 + + +# Configuration of the "jvm" context for null +#jvm.class=org.apache.hadoop.metrics.spi.NullContext + +# Configuration of the "jvm" context for file +#jvm.class=org.apache.hadoop.metrics.file.FileContext +#jvm.period=10 +#jvm.fileName=/tmp/jvmmetrics.log + +# Configuration of the "jvm" context for ganglia +# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext +# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 +# jvm.period=10 +# jvm.servers=localhost:8649 + +# Configuration of the "rpc" context for null +rpc.class=org.apache.hadoop.metrics.spi.NullContext + +# Configuration of the "rpc" context for file +#rpc.class=org.apache.hadoop.metrics.file.FileContext +#rpc.period=10 +#rpc.fileName=/tmp/rpcmetrics.log + +# Configuration of the "rpc" context for ganglia +# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext +# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 +# rpc.period=10 +# rpc.servers=localhost:8649 + + +# Configuration of the "ugi" context for null +ugi.class=org.apache.hadoop.metrics.spi.NullContext + +# Configuration of the "ugi" context for file +#ugi.class=org.apache.hadoop.metrics.file.FileContext +#ugi.period=10 +#ugi.fileName=/tmp/ugimetrics.log + +# Configuration of the "ugi" context for ganglia +# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext +# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 +# ugi.period=10 +# ugi.servers=localhost:8649 + diff --git a/hadoop/roles/common/templates/hadoop_conf/hadoop-metrics2.properties.j2 b/hadoop/roles/common/templates/hadoop_conf/hadoop-metrics2.properties.j2 new file mode 100644 index 0000000..c3ffe31 --- /dev/null +++ b/hadoop/roles/common/templates/hadoop_conf/hadoop-metrics2.properties.j2 @@ -0,0 +1,44 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. 
+# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +# syntax: [prefix].[source|sink].[instance].[options] +# See javadoc of package-info.java for org.apache.hadoop.metrics2 for details + +*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink +# default sampling period, in seconds +*.period=10 + +# The namenode-metrics.out will contain metrics from all context +#namenode.sink.file.filename=namenode-metrics.out +# Specifying a special sampling period for namenode: +#namenode.sink.*.period=8 + +#datanode.sink.file.filename=datanode-metrics.out + +# the following example split metrics of different +# context to different sinks (in this case files) +#jobtracker.sink.file_jvm.context=jvm +#jobtracker.sink.file_jvm.filename=jobtracker-jvm-metrics.out +#jobtracker.sink.file_mapred.context=mapred +#jobtracker.sink.file_mapred.filename=jobtracker-mapred-metrics.out + +#tasktracker.sink.file.filename=tasktracker-metrics.out + +#maptask.sink.file.filename=maptask-metrics.out + +#reducetask.sink.file.filename=reducetask-metrics.out + diff --git a/hadoop/roles/common/templates/hadoop_conf/hdfs-site.xml.j2 b/hadoop/roles/common/templates/hadoop_conf/hdfs-site.xml.j2 new file mode 100644 index 0000000..0c537fc --- /dev/null +++ b/hadoop/roles/common/templates/hadoop_conf/hdfs-site.xml.j2 @@ -0,0 +1,57 @@ + + + + + + + dfs.blocksize + {{ hadoop['dfs_blocksize'] }} + + + dfs.permissions.superusergroup + {{ hadoop['dfs_permissions_superusergroup'] }} + + + dfs.namenode.http.address + 0.0.0.0:{{ hadoop['dfs_namenode_http_address_port'] }} + + + dfs.datanode.address + 0.0.0.0:{{ hadoop['dfs_datanode_address_port'] }} + + + dfs.datanode.http.address + 0.0.0.0:{{ hadoop['dfs_datanode_http_address_port'] }} + + + dfs.datanode.ipc.address + 0.0.0.0:{{ hadoop['dfs_datanode_ipc_address_port'] }} + + + dfs.replication + {{ hadoop['dfs_replication'] }} + + + dfs.namenode.name.dir + {{ hadoop['dfs_namenode_name_dir'] | join(',') }} + + + dfs.datanode.data.dir + {{ hadoop['dfs_datanode_data_dir'] | join(',') }} + + diff --git a/hadoop/roles/common/templates/hadoop_conf/log4j.properties.j2 b/hadoop/roles/common/templates/hadoop_conf/log4j.properties.j2 new file mode 100644 index 0000000..b92ad27 --- /dev/null +++ b/hadoop/roles/common/templates/hadoop_conf/log4j.properties.j2 @@ -0,0 +1,219 @@ +# Copyright 2011 The Apache Software Foundation +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. 
You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# Define some default values that can be overridden by system properties +hadoop.root.logger=INFO,console +hadoop.log.dir=. +hadoop.log.file=hadoop.log + +# Define the root logger to the system property "hadoop.root.logger". +log4j.rootLogger=${hadoop.root.logger}, EventCounter + +# Logging Threshold +log4j.threshold=ALL + +# Null Appender +log4j.appender.NullAppender=org.apache.log4j.varia.NullAppender + +# +# Rolling File Appender - cap space usage at 5gb. +# +hadoop.log.maxfilesize=256MB +hadoop.log.maxbackupindex=20 +log4j.appender.RFA=org.apache.log4j.RollingFileAppender +log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file} + +log4j.appender.RFA.MaxFileSize=${hadoop.log.maxfilesize} +log4j.appender.RFA.MaxBackupIndex=${hadoop.log.maxbackupindex} + +log4j.appender.RFA.layout=org.apache.log4j.PatternLayout + +# Pattern format: Date LogLevel LoggerName LogMessage +log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n +# Debugging Pattern format +#log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n + + +# +# Daily Rolling File Appender +# + +log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender +log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file} + +# Rollver at midnight +log4j.appender.DRFA.DatePattern=.yyyy-MM-dd + +# 30-day backup +#log4j.appender.DRFA.MaxBackupIndex=30 +log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout + +# Pattern format: Date LogLevel LoggerName LogMessage +log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n +# Debugging Pattern format +#log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n + + +# +# console +# Add "console" to rootlogger above if you want to use this +# + +log4j.appender.console=org.apache.log4j.ConsoleAppender +log4j.appender.console.target=System.err +log4j.appender.console.layout=org.apache.log4j.PatternLayout +log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n + +# +# TaskLog Appender +# + +#Default values +hadoop.tasklog.taskid=null +hadoop.tasklog.iscleanup=false +hadoop.tasklog.noKeepSplits=4 +hadoop.tasklog.totalLogFileSize=100 +hadoop.tasklog.purgeLogSplits=true +hadoop.tasklog.logsRetainHours=12 + +log4j.appender.TLA=org.apache.hadoop.mapred.TaskLogAppender +log4j.appender.TLA.taskId=${hadoop.tasklog.taskid} +log4j.appender.TLA.isCleanup=${hadoop.tasklog.iscleanup} +log4j.appender.TLA.totalLogFileSize=${hadoop.tasklog.totalLogFileSize} + +log4j.appender.TLA.layout=org.apache.log4j.PatternLayout +log4j.appender.TLA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n + +# +# HDFS block state change log from block manager +# +# Uncomment the following to suppress normal block state change +# messages from BlockManager in NameNode. 
+#log4j.logger.BlockStateChange=WARN + +# +#Security appender +# +hadoop.security.logger=INFO,NullAppender +hadoop.security.log.maxfilesize=256MB +hadoop.security.log.maxbackupindex=20 +log4j.category.SecurityLogger=${hadoop.security.logger} +hadoop.security.log.file=SecurityAuth-${user.name}.audit +log4j.appender.RFAS=org.apache.log4j.RollingFileAppender +log4j.appender.RFAS.File=${hadoop.log.dir}/${hadoop.security.log.file} +log4j.appender.RFAS.layout=org.apache.log4j.PatternLayout +log4j.appender.RFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n +log4j.appender.RFAS.MaxFileSize=${hadoop.security.log.maxfilesize} +log4j.appender.RFAS.MaxBackupIndex=${hadoop.security.log.maxbackupindex} + +# +# Daily Rolling Security appender +# +log4j.appender.DRFAS=org.apache.log4j.DailyRollingFileAppender +log4j.appender.DRFAS.File=${hadoop.log.dir}/${hadoop.security.log.file} +log4j.appender.DRFAS.layout=org.apache.log4j.PatternLayout +log4j.appender.DRFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n +log4j.appender.DRFAS.DatePattern=.yyyy-MM-dd + +# +# hdfs audit logging +# +hdfs.audit.logger=INFO,NullAppender +hdfs.audit.log.maxfilesize=256MB +hdfs.audit.log.maxbackupindex=20 +log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger} +log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false +log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender +log4j.appender.RFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log +log4j.appender.RFAAUDIT.layout=org.apache.log4j.PatternLayout +log4j.appender.RFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n +log4j.appender.RFAAUDIT.MaxFileSize=${hdfs.audit.log.maxfilesize} +log4j.appender.RFAAUDIT.MaxBackupIndex=${hdfs.audit.log.maxbackupindex} + +# +# mapred audit logging +# +mapred.audit.logger=INFO,NullAppender +mapred.audit.log.maxfilesize=256MB +mapred.audit.log.maxbackupindex=20 +log4j.logger.org.apache.hadoop.mapred.AuditLogger=${mapred.audit.logger} +log4j.additivity.org.apache.hadoop.mapred.AuditLogger=false +log4j.appender.MRAUDIT=org.apache.log4j.RollingFileAppender +log4j.appender.MRAUDIT.File=${hadoop.log.dir}/mapred-audit.log +log4j.appender.MRAUDIT.layout=org.apache.log4j.PatternLayout +log4j.appender.MRAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n +log4j.appender.MRAUDIT.MaxFileSize=${mapred.audit.log.maxfilesize} +log4j.appender.MRAUDIT.MaxBackupIndex=${mapred.audit.log.maxbackupindex} + +# Custom Logging levels + +#log4j.logger.org.apache.hadoop.mapred.JobTracker=DEBUG +#log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG +#log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=DEBUG + +# Jets3t library +log4j.logger.org.jets3t.service.impl.rest.httpclient.RestS3Service=ERROR + +# +# Event Counter Appender +# Sends counts of logging messages at different severity levels to Hadoop Metrics. 
+# +log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter + +# +# Job Summary Appender +# +# Use following logger to send summary to separate file defined by +# hadoop.mapreduce.jobsummary.log.file : +# hadoop.mapreduce.jobsummary.logger=INFO,JSA +# +hadoop.mapreduce.jobsummary.logger=${hadoop.root.logger} +hadoop.mapreduce.jobsummary.log.file=hadoop-mapreduce.jobsummary.log +hadoop.mapreduce.jobsummary.log.maxfilesize=256MB +hadoop.mapreduce.jobsummary.log.maxbackupindex=20 +log4j.appender.JSA=org.apache.log4j.RollingFileAppender +log4j.appender.JSA.File=${hadoop.log.dir}/${hadoop.mapreduce.jobsummary.log.file} +log4j.appender.JSA.MaxFileSize=${hadoop.mapreduce.jobsummary.log.maxfilesize} +log4j.appender.JSA.MaxBackupIndex=${hadoop.mapreduce.jobsummary.log.maxbackupindex} +log4j.appender.JSA.layout=org.apache.log4j.PatternLayout +log4j.appender.JSA.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n +log4j.logger.org.apache.hadoop.mapred.JobInProgress$JobSummary=${hadoop.mapreduce.jobsummary.logger} +log4j.additivity.org.apache.hadoop.mapred.JobInProgress$JobSummary=false + +# +# Yarn ResourceManager Application Summary Log +# +# Set the ResourceManager summary log filename +#yarn.server.resourcemanager.appsummary.log.file=rm-appsummary.log +# Set the ResourceManager summary log level and appender +#yarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY + +# Appender for ResourceManager Application Summary Log +# Requires the following properties to be set +# - hadoop.log.dir (Hadoop Log directory) +# - yarn.server.resourcemanager.appsummary.log.file (resource manager app summary log filename) +# - yarn.server.resourcemanager.appsummary.logger (resource manager app summary log level and appender) + +#log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger} +#log4j.additivity.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=false +#log4j.appender.RMSUMMARY=org.apache.log4j.RollingFileAppender +#log4j.appender.RMSUMMARY.File=${hadoop.log.dir}/${yarn.server.resourcemanager.appsummary.log.file} +#log4j.appender.RMSUMMARY.MaxFileSize=256MB +#log4j.appender.RMSUMMARY.MaxBackupIndex=20 +#log4j.appender.RMSUMMARY.layout=org.apache.log4j.PatternLayout +#log4j.appender.RMSUMMARY.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n diff --git a/hadoop/roles/common/templates/hadoop_conf/mapred-site.xml.j2 b/hadoop/roles/common/templates/hadoop_conf/mapred-site.xml.j2 new file mode 100644 index 0000000..0941698 --- /dev/null +++ b/hadoop/roles/common/templates/hadoop_conf/mapred-site.xml.j2 @@ -0,0 +1,22 @@ + + + + mapred.job.tracker + {{ hostvars[groups['hadoop_masters'][0]]['ansible_hostname'] }}:{{ hadoop['mapred_job_tracker_port'] }} + + + + mapred.local.dir + {{ hadoop["mapred_local_dir"] | join(',') }} + + + + mapred.task.tracker.http.address + 0.0.0.0:{{ hadoop['mapred_task_tracker_http_address_port'] }} + + + mapred.job.tracker.http.address + 0.0.0.0:{{ hadoop['mapred_job_tracker_http_address_port'] }} + + + diff --git a/hadoop/roles/common/templates/hadoop_conf/slaves.j2 b/hadoop/roles/common/templates/hadoop_conf/slaves.j2 new file mode 100644 index 0000000..44f97e2 --- /dev/null +++ b/hadoop/roles/common/templates/hadoop_conf/slaves.j2 @@ -0,0 +1,3 @@ +{% for host in groups['hadoop_slaves'] %} +{{ host }} +{% endfor %} diff --git a/hadoop/roles/common/templates/hadoop_conf/ssl-client.xml.example.j2 
b/hadoop/roles/common/templates/hadoop_conf/ssl-client.xml.example.j2 new file mode 100644 index 0000000..a50dce4 --- /dev/null +++ b/hadoop/roles/common/templates/hadoop_conf/ssl-client.xml.example.j2 @@ -0,0 +1,80 @@ + + + + + + + ssl.client.truststore.location + + Truststore to be used by clients like distcp. Must be + specified. + + + + + ssl.client.truststore.password + + Optional. Default value is "". + + + + + ssl.client.truststore.type + jks + Optional. The keystore file format, default value is "jks". + + + + + ssl.client.truststore.reload.interval + 10000 + Truststore reload check interval, in milliseconds. + Default value is 10000 (10 seconds). + + + + + ssl.client.keystore.location + + Keystore to be used by clients like distcp. Must be + specified. + + + + + ssl.client.keystore.password + + Optional. Default value is "". + + + + + ssl.client.keystore.keypassword + + Optional. Default value is "". + + + + + ssl.client.keystore.type + jks + Optional. The keystore file format, default value is "jks". + + + + diff --git a/hadoop/roles/common/templates/hadoop_conf/ssl-server.xml.example.j2 b/hadoop/roles/common/templates/hadoop_conf/ssl-server.xml.example.j2 new file mode 100644 index 0000000..4b363ff --- /dev/null +++ b/hadoop/roles/common/templates/hadoop_conf/ssl-server.xml.example.j2 @@ -0,0 +1,77 @@ + + + + + + + ssl.server.truststore.location + + Truststore to be used by NN and DN. Must be specified. + + + + + ssl.server.truststore.password + + Optional. Default value is "". + + + + + ssl.server.truststore.type + jks + Optional. The keystore file format, default value is "jks". + + + + + ssl.server.truststore.reload.interval + 10000 + Truststore reload check interval, in milliseconds. + Default value is 10000 (10 seconds). + + + + ssl.server.keystore.location + + Keystore to be used by NN and DN. Must be specified. + + + + + ssl.server.keystore.password + + Must be specified. + + + + + ssl.server.keystore.keypassword + + Must be specified. + + + + + ssl.server.keystore.type + jks + Optional. The keystore file format, default value is "jks". 
+ + + + diff --git a/hadoop/roles/common/templates/hadoop_ha_conf/core-site.xml.j2 b/hadoop/roles/common/templates/hadoop_ha_conf/core-site.xml.j2 new file mode 100644 index 0000000..62f355d --- /dev/null +++ b/hadoop/roles/common/templates/hadoop_ha_conf/core-site.xml.j2 @@ -0,0 +1,25 @@ + + + + + + + fs.defaultFS + hdfs://{{ hadoop['nameservice_id'] }}/ + + diff --git a/hadoop/roles/common/templates/hadoop_ha_conf/hadoop-metrics.properties.j2 b/hadoop/roles/common/templates/hadoop_ha_conf/hadoop-metrics.properties.j2 new file mode 100644 index 0000000..c1b2eb7 --- /dev/null +++ b/hadoop/roles/common/templates/hadoop_ha_conf/hadoop-metrics.properties.j2 @@ -0,0 +1,75 @@ +# Configuration of the "dfs" context for null +dfs.class=org.apache.hadoop.metrics.spi.NullContext + +# Configuration of the "dfs" context for file +#dfs.class=org.apache.hadoop.metrics.file.FileContext +#dfs.period=10 +#dfs.fileName=/tmp/dfsmetrics.log + +# Configuration of the "dfs" context for ganglia +# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter) +# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext +# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 +# dfs.period=10 +# dfs.servers=localhost:8649 + + +# Configuration of the "mapred" context for null +mapred.class=org.apache.hadoop.metrics.spi.NullContext + +# Configuration of the "mapred" context for file +#mapred.class=org.apache.hadoop.metrics.file.FileContext +#mapred.period=10 +#mapred.fileName=/tmp/mrmetrics.log + +# Configuration of the "mapred" context for ganglia +# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter) +# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext +# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 +# mapred.period=10 +# mapred.servers=localhost:8649 + + +# Configuration of the "jvm" context for null +#jvm.class=org.apache.hadoop.metrics.spi.NullContext + +# Configuration of the "jvm" context for file +#jvm.class=org.apache.hadoop.metrics.file.FileContext +#jvm.period=10 +#jvm.fileName=/tmp/jvmmetrics.log + +# Configuration of the "jvm" context for ganglia +# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext +# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 +# jvm.period=10 +# jvm.servers=localhost:8649 + +# Configuration of the "rpc" context for null +rpc.class=org.apache.hadoop.metrics.spi.NullContext + +# Configuration of the "rpc" context for file +#rpc.class=org.apache.hadoop.metrics.file.FileContext +#rpc.period=10 +#rpc.fileName=/tmp/rpcmetrics.log + +# Configuration of the "rpc" context for ganglia +# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext +# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 +# rpc.period=10 +# rpc.servers=localhost:8649 + + +# Configuration of the "ugi" context for null +ugi.class=org.apache.hadoop.metrics.spi.NullContext + +# Configuration of the "ugi" context for file +#ugi.class=org.apache.hadoop.metrics.file.FileContext +#ugi.period=10 +#ugi.fileName=/tmp/ugimetrics.log + +# Configuration of the "ugi" context for ganglia +# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext +# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 +# ugi.period=10 +# ugi.servers=localhost:8649 + diff --git a/hadoop/roles/common/templates/hadoop_ha_conf/hadoop-metrics2.properties.j2 b/hadoop/roles/common/templates/hadoop_ha_conf/hadoop-metrics2.properties.j2 new file mode 100644 index 0000000..c3ffe31 --- /dev/null +++ b/hadoop/roles/common/templates/hadoop_ha_conf/hadoop-metrics2.properties.j2 @@ 
-0,0 +1,44 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +# syntax: [prefix].[source|sink].[instance].[options] +# See javadoc of package-info.java for org.apache.hadoop.metrics2 for details + +*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink +# default sampling period, in seconds +*.period=10 + +# The namenode-metrics.out will contain metrics from all context +#namenode.sink.file.filename=namenode-metrics.out +# Specifying a special sampling period for namenode: +#namenode.sink.*.period=8 + +#datanode.sink.file.filename=datanode-metrics.out + +# the following example split metrics of different +# context to different sinks (in this case files) +#jobtracker.sink.file_jvm.context=jvm +#jobtracker.sink.file_jvm.filename=jobtracker-jvm-metrics.out +#jobtracker.sink.file_mapred.context=mapred +#jobtracker.sink.file_mapred.filename=jobtracker-mapred-metrics.out + +#tasktracker.sink.file.filename=tasktracker-metrics.out + +#maptask.sink.file.filename=maptask-metrics.out + +#reducetask.sink.file.filename=reducetask-metrics.out + diff --git a/hadoop/roles/common/templates/hadoop_ha_conf/hdfs-site.xml.j2 b/hadoop/roles/common/templates/hadoop_ha_conf/hdfs-site.xml.j2 new file mode 100644 index 0000000..7dadd91 --- /dev/null +++ b/hadoop/roles/common/templates/hadoop_ha_conf/hdfs-site.xml.j2 @@ -0,0 +1,103 @@ + + + + + + dfs.nameservices + {{ hadoop['nameservice_id'] }} + + + dfs.ha.namenodes.{{ hadoop['nameservice_id'] }} + {{ groups.hadoop_masters | join(',') }} + + + dfs.blocksize + {{ hadoop['dfs_blocksize'] }} + + + dfs.permissions.superusergroup + {{ hadoop['dfs_permissions_superusergroup'] }} + + + dfs.ha.automatic-failover.enabled + true + + + ha.zookeeper.quorum + {{ groups.zookeeper_servers | join(':' ~ hadoop['zookeeper_clientport'] + ',') }}:{{ hadoop['zookeeper_clientport'] }} + + +{% for host in groups['hadoop_masters'] %} + + dfs.namenode.rpc-address.{{ hadoop['nameservice_id'] }}.{{ host }} + {{ host }}:{{ hadoop['fs_default_FS_port'] }} + +{% endfor %} +{% for host in groups['hadoop_masters'] %} + + dfs.namenode.http-address.{{ hadoop['nameservice_id'] }}.{{ host }} + {{ host }}:{{ hadoop['dfs_namenode_http_address_port'] }} + +{% endfor %} + + dfs.namenode.shared.edits.dir + qjournal://{{ groups.qjournal_servers | join(':' ~ hadoop['qjournal_port'] + ';') }}:{{ hadoop['qjournal_port'] }}/{{ hadoop['nameservice_id'] }} + + + dfs.journalnode.edits.dir + {{ hadoop['dfs_journalnode_edits_dir'] }} + + + dfs.client.failover.proxy.provider.{{ hadoop['nameservice_id'] }} + org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider + + + dfs.ha.fencing.methods + shell(/bin/true ) + + + + dfs.ha.zkfc.port + {{ hadoop['dfs_ha_zkfc_port'] }} + + + + dfs.datanode.address + 0.0.0.0:{{ hadoop['dfs_datanode_address_port'] }} + + + 
dfs.datanode.http.address + 0.0.0.0:{{ hadoop['dfs_datanode_http_address_port'] }} + + + dfs.datanode.ipc.address + 0.0.0.0:{{ hadoop['dfs_datanode_ipc_address_port'] }} + + + dfs.replication + {{ hadoop['dfs_replication'] }} + + + dfs.namenode.name.dir + {{ hadoop['dfs_namenode_name_dir'] | join(',') }} + + + dfs.datanode.data.dir + {{ hadoop['dfs_datanode_data_dir'] | join(',') }} + + diff --git a/hadoop/roles/common/templates/hadoop_ha_conf/log4j.properties.j2 b/hadoop/roles/common/templates/hadoop_ha_conf/log4j.properties.j2 new file mode 100644 index 0000000..b92ad27 --- /dev/null +++ b/hadoop/roles/common/templates/hadoop_ha_conf/log4j.properties.j2 @@ -0,0 +1,219 @@ +# Copyright 2011 The Apache Software Foundation +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +# Define some default values that can be overridden by system properties +hadoop.root.logger=INFO,console +hadoop.log.dir=. +hadoop.log.file=hadoop.log + +# Define the root logger to the system property "hadoop.root.logger". +log4j.rootLogger=${hadoop.root.logger}, EventCounter + +# Logging Threshold +log4j.threshold=ALL + +# Null Appender +log4j.appender.NullAppender=org.apache.log4j.varia.NullAppender + +# +# Rolling File Appender - cap space usage at 5gb. 
+# +hadoop.log.maxfilesize=256MB +hadoop.log.maxbackupindex=20 +log4j.appender.RFA=org.apache.log4j.RollingFileAppender +log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file} + +log4j.appender.RFA.MaxFileSize=${hadoop.log.maxfilesize} +log4j.appender.RFA.MaxBackupIndex=${hadoop.log.maxbackupindex} + +log4j.appender.RFA.layout=org.apache.log4j.PatternLayout + +# Pattern format: Date LogLevel LoggerName LogMessage +log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n +# Debugging Pattern format +#log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n + + +# +# Daily Rolling File Appender +# + +log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender +log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file} + +# Rollver at midnight +log4j.appender.DRFA.DatePattern=.yyyy-MM-dd + +# 30-day backup +#log4j.appender.DRFA.MaxBackupIndex=30 +log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout + +# Pattern format: Date LogLevel LoggerName LogMessage +log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n +# Debugging Pattern format +#log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n + + +# +# console +# Add "console" to rootlogger above if you want to use this +# + +log4j.appender.console=org.apache.log4j.ConsoleAppender +log4j.appender.console.target=System.err +log4j.appender.console.layout=org.apache.log4j.PatternLayout +log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n + +# +# TaskLog Appender +# + +#Default values +hadoop.tasklog.taskid=null +hadoop.tasklog.iscleanup=false +hadoop.tasklog.noKeepSplits=4 +hadoop.tasklog.totalLogFileSize=100 +hadoop.tasklog.purgeLogSplits=true +hadoop.tasklog.logsRetainHours=12 + +log4j.appender.TLA=org.apache.hadoop.mapred.TaskLogAppender +log4j.appender.TLA.taskId=${hadoop.tasklog.taskid} +log4j.appender.TLA.isCleanup=${hadoop.tasklog.iscleanup} +log4j.appender.TLA.totalLogFileSize=${hadoop.tasklog.totalLogFileSize} + +log4j.appender.TLA.layout=org.apache.log4j.PatternLayout +log4j.appender.TLA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n + +# +# HDFS block state change log from block manager +# +# Uncomment the following to suppress normal block state change +# messages from BlockManager in NameNode. 
+#log4j.logger.BlockStateChange=WARN + +# +#Security appender +# +hadoop.security.logger=INFO,NullAppender +hadoop.security.log.maxfilesize=256MB +hadoop.security.log.maxbackupindex=20 +log4j.category.SecurityLogger=${hadoop.security.logger} +hadoop.security.log.file=SecurityAuth-${user.name}.audit +log4j.appender.RFAS=org.apache.log4j.RollingFileAppender +log4j.appender.RFAS.File=${hadoop.log.dir}/${hadoop.security.log.file} +log4j.appender.RFAS.layout=org.apache.log4j.PatternLayout +log4j.appender.RFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n +log4j.appender.RFAS.MaxFileSize=${hadoop.security.log.maxfilesize} +log4j.appender.RFAS.MaxBackupIndex=${hadoop.security.log.maxbackupindex} + +# +# Daily Rolling Security appender +# +log4j.appender.DRFAS=org.apache.log4j.DailyRollingFileAppender +log4j.appender.DRFAS.File=${hadoop.log.dir}/${hadoop.security.log.file} +log4j.appender.DRFAS.layout=org.apache.log4j.PatternLayout +log4j.appender.DRFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n +log4j.appender.DRFAS.DatePattern=.yyyy-MM-dd + +# +# hdfs audit logging +# +hdfs.audit.logger=INFO,NullAppender +hdfs.audit.log.maxfilesize=256MB +hdfs.audit.log.maxbackupindex=20 +log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger} +log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false +log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender +log4j.appender.RFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log +log4j.appender.RFAAUDIT.layout=org.apache.log4j.PatternLayout +log4j.appender.RFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n +log4j.appender.RFAAUDIT.MaxFileSize=${hdfs.audit.log.maxfilesize} +log4j.appender.RFAAUDIT.MaxBackupIndex=${hdfs.audit.log.maxbackupindex} + +# +# mapred audit logging +# +mapred.audit.logger=INFO,NullAppender +mapred.audit.log.maxfilesize=256MB +mapred.audit.log.maxbackupindex=20 +log4j.logger.org.apache.hadoop.mapred.AuditLogger=${mapred.audit.logger} +log4j.additivity.org.apache.hadoop.mapred.AuditLogger=false +log4j.appender.MRAUDIT=org.apache.log4j.RollingFileAppender +log4j.appender.MRAUDIT.File=${hadoop.log.dir}/mapred-audit.log +log4j.appender.MRAUDIT.layout=org.apache.log4j.PatternLayout +log4j.appender.MRAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n +log4j.appender.MRAUDIT.MaxFileSize=${mapred.audit.log.maxfilesize} +log4j.appender.MRAUDIT.MaxBackupIndex=${mapred.audit.log.maxbackupindex} + +# Custom Logging levels + +#log4j.logger.org.apache.hadoop.mapred.JobTracker=DEBUG +#log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG +#log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=DEBUG + +# Jets3t library +log4j.logger.org.jets3t.service.impl.rest.httpclient.RestS3Service=ERROR + +# +# Event Counter Appender +# Sends counts of logging messages at different severity levels to Hadoop Metrics. 
+# +log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter + +# +# Job Summary Appender +# +# Use following logger to send summary to separate file defined by +# hadoop.mapreduce.jobsummary.log.file : +# hadoop.mapreduce.jobsummary.logger=INFO,JSA +# +hadoop.mapreduce.jobsummary.logger=${hadoop.root.logger} +hadoop.mapreduce.jobsummary.log.file=hadoop-mapreduce.jobsummary.log +hadoop.mapreduce.jobsummary.log.maxfilesize=256MB +hadoop.mapreduce.jobsummary.log.maxbackupindex=20 +log4j.appender.JSA=org.apache.log4j.RollingFileAppender +log4j.appender.JSA.File=${hadoop.log.dir}/${hadoop.mapreduce.jobsummary.log.file} +log4j.appender.JSA.MaxFileSize=${hadoop.mapreduce.jobsummary.log.maxfilesize} +log4j.appender.JSA.MaxBackupIndex=${hadoop.mapreduce.jobsummary.log.maxbackupindex} +log4j.appender.JSA.layout=org.apache.log4j.PatternLayout +log4j.appender.JSA.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n +log4j.logger.org.apache.hadoop.mapred.JobInProgress$JobSummary=${hadoop.mapreduce.jobsummary.logger} +log4j.additivity.org.apache.hadoop.mapred.JobInProgress$JobSummary=false + +# +# Yarn ResourceManager Application Summary Log +# +# Set the ResourceManager summary log filename +#yarn.server.resourcemanager.appsummary.log.file=rm-appsummary.log +# Set the ResourceManager summary log level and appender +#yarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY + +# Appender for ResourceManager Application Summary Log +# Requires the following properties to be set +# - hadoop.log.dir (Hadoop Log directory) +# - yarn.server.resourcemanager.appsummary.log.file (resource manager app summary log filename) +# - yarn.server.resourcemanager.appsummary.logger (resource manager app summary log level and appender) + +#log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger} +#log4j.additivity.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=false +#log4j.appender.RMSUMMARY=org.apache.log4j.RollingFileAppender +#log4j.appender.RMSUMMARY.File=${hadoop.log.dir}/${yarn.server.resourcemanager.appsummary.log.file} +#log4j.appender.RMSUMMARY.MaxFileSize=256MB +#log4j.appender.RMSUMMARY.MaxBackupIndex=20 +#log4j.appender.RMSUMMARY.layout=org.apache.log4j.PatternLayout +#log4j.appender.RMSUMMARY.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n diff --git a/hadoop/roles/common/templates/hadoop_ha_conf/mapred-site.xml.j2 b/hadoop/roles/common/templates/hadoop_ha_conf/mapred-site.xml.j2 new file mode 100644 index 0000000..4a839c9 --- /dev/null +++ b/hadoop/roles/common/templates/hadoop_ha_conf/mapred-site.xml.j2 @@ -0,0 +1,120 @@ + + + + mapred.job.tracker + {{ hadoop['mapred_job_tracker_ha_servicename'] }} + + + + mapred.jobtrackers.{{ hadoop['mapred_job_tracker_ha_servicename'] }} + {{ groups['hadoop_masters'] | join(',') }} + Comma-separated list of JobTracker IDs. 
+ + + + mapred.ha.automatic-failover.enabled + true + + + + mapred.ha.zkfc.port + {{ hadoop['mapred_ha_zkfc_port'] }} + + + + mapred.ha.fencing.methods + shell(/bin/true) + + + + ha.zookeeper.quorum + {{ groups.zookeeper_servers | join(':' ~ hadoop['zookeeper_clientport'] + ',') }}:{{ hadoop['zookeeper_clientport'] }} + + +{% for host in groups['hadoop_masters'] %} + + mapred.jobtracker.rpc-address.{{ hadoop['mapred_job_tracker_ha_servicename'] }}.{{ host }} + {{ host }}:{{ hadoop['mapred_job_tracker_port'] }} + +{% endfor %} +{% for host in groups['hadoop_masters'] %} + + mapred.job.tracker.http.address.{{ hadoop['mapred_job_tracker_ha_servicename'] }}.{{ host }} + 0.0.0.0:{{ hadoop['mapred_job_tracker_http_address_port'] }} + +{% endfor %} +{% for host in groups['hadoop_masters'] %} + + mapred.ha.jobtracker.rpc-address.{{ hadoop['mapred_job_tracker_ha_servicename'] }}.{{ host }} + {{ host }}:{{ hadoop['mapred_ha_jobtracker_rpc-address_port'] }} + +{% endfor %} +{% for host in groups['hadoop_masters'] %} + + mapred.ha.jobtracker.http-redirect-address.{{ hadoop['mapred_job_tracker_ha_servicename'] }}.{{ host }} + {{ host }}:{{ hadoop['mapred_job_tracker_http_address_port'] }} + +{% endfor %} + + + mapred.jobtracker.restart.recover + true + + + + mapred.job.tracker.persist.jobstatus.active + true + + + + mapred.job.tracker.persist.jobstatus.hours + 1 + + + + mapred.job.tracker.persist.jobstatus.dir + {{ hadoop['mapred_job_tracker_persist_jobstatus_dir'] }} + + + + mapred.client.failover.proxy.provider.{{ hadoop['mapred_job_tracker_ha_servicename'] }} + org.apache.hadoop.mapred.ConfiguredFailoverProxyProvider + + + + mapred.client.failover.max.attempts + 15 + + + + mapred.client.failover.sleep.base.millis + 500 + + + + mapred.client.failover.sleep.max.millis + 1500 + + + + mapred.client.failover.connection.retries + 0 + + + + mapred.client.failover.connection.retries.on.timeouts + 0 + + + + + mapred.local.dir + {{ hadoop["mapred_local_dir"] | join(',') }} + + + + mapred.task.tracker.http.address + 0.0.0.0:{{ hadoop['mapred_task_tracker_http_address_port'] }} + + + diff --git a/hadoop/roles/common/templates/hadoop_ha_conf/slaves.j2 b/hadoop/roles/common/templates/hadoop_ha_conf/slaves.j2 new file mode 100644 index 0000000..44f97e2 --- /dev/null +++ b/hadoop/roles/common/templates/hadoop_ha_conf/slaves.j2 @@ -0,0 +1,3 @@ +{% for host in groups['hadoop_slaves'] %} +{{ host }} +{% endfor %} diff --git a/hadoop/roles/common/templates/hadoop_ha_conf/ssl-client.xml.example.j2 b/hadoop/roles/common/templates/hadoop_ha_conf/ssl-client.xml.example.j2 new file mode 100644 index 0000000..a50dce4 --- /dev/null +++ b/hadoop/roles/common/templates/hadoop_ha_conf/ssl-client.xml.example.j2 @@ -0,0 +1,80 @@ + + + + + + + ssl.client.truststore.location + + Truststore to be used by clients like distcp. Must be + specified. + + + + + ssl.client.truststore.password + + Optional. Default value is "". + + + + + ssl.client.truststore.type + jks + Optional. The keystore file format, default value is "jks". + + + + + ssl.client.truststore.reload.interval + 10000 + Truststore reload check interval, in milliseconds. + Default value is 10000 (10 seconds). + + + + + ssl.client.keystore.location + + Keystore to be used by clients like distcp. Must be + specified. + + + + + ssl.client.keystore.password + + Optional. Default value is "". + + + + + ssl.client.keystore.keypassword + + Optional. Default value is "". + + + + + ssl.client.keystore.type + jks + Optional. The keystore file format, default value is "jks". 
+  </property>
+
+</configuration>
diff --git a/hadoop/roles/common/templates/hadoop_ha_conf/ssl-server.xml.example.j2 b/hadoop/roles/common/templates/hadoop_ha_conf/ssl-server.xml.example.j2
new file mode 100644
index 0000000..4b363ff
--- /dev/null
+++ b/hadoop/roles/common/templates/hadoop_ha_conf/ssl-server.xml.example.j2
@@ -0,0 +1,77 @@
+<?xml version="1.0"?>
+<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
+
+<configuration>
+
+  <property>
+    <name>ssl.server.truststore.location</name>
+    <value></value>
+    <description>Truststore to be used by NN and DN. Must be specified.
+    </description>
+  </property>
+
+  <property>
+    <name>ssl.server.truststore.password</name>
+    <value></value>
+    <description>Optional. Default value is "".
+    </description>
+  </property>
+
+  <property>
+    <name>ssl.server.truststore.type</name>
+    <value>jks</value>
+    <description>Optional. The keystore file format, default value is "jks".
+    </description>
+  </property>
+
+  <property>
+    <name>ssl.server.truststore.reload.interval</name>
+    <value>10000</value>
+    <description>Truststore reload check interval, in milliseconds.
+    Default value is 10000 (10 seconds).
+    </description>
+  </property>
+
+  <property>
+    <name>ssl.server.keystore.location</name>
+    <value></value>
+    <description>Keystore to be used by NN and DN. Must be specified.
+    </description>
+  </property>
+
+  <property>
+    <name>ssl.server.keystore.password</name>
+    <value></value>
+    <description>Must be specified.
+    </description>
+  </property>
+
+  <property>
+    <name>ssl.server.keystore.keypassword</name>
+    <value></value>
+    <description>Must be specified.
+    </description>
+  </property>
+
+  <property>
+    <name>ssl.server.keystore.type</name>
+    <value>jks</value>
+    <description>Optional. The keystore file format, default value is "jks".
+    </description>
+  </property>
+
+</configuration>
diff --git a/hadoop/roles/common/templates/iptables.j2 b/hadoop/roles/common/templates/iptables.j2
new file mode 100644
index 0000000..f9814fc
--- /dev/null
+++ b/hadoop/roles/common/templates/iptables.j2
@@ -0,0 +1,41 @@
+# Firewall configuration written by system-config-firewall
+# Manual customization of this file is not recommended.
+*filter
+:INPUT ACCEPT [0:0]
+:FORWARD ACCEPT [0:0]
+:OUTPUT ACCEPT [0:0]
+{% if 'hadoop_masters' in group_names %}
+-A INPUT -p tcp --dport {{ hadoop['fs_default_FS_port'] }} -j ACCEPT
+-A INPUT -p tcp --dport {{ hadoop['dfs_namenode_http_address_port'] }} -j ACCEPT
+-A INPUT -p tcp --dport {{ hadoop['mapred_job_tracker_port'] }} -j ACCEPT
+-A INPUT -p tcp --dport {{ hadoop['mapred_job_tracker_http_address_port'] }} -j ACCEPT
+-A INPUT -p tcp --dport {{ hadoop['mapred_ha_jobtracker_rpc-address_port'] }} -j ACCEPT
+-A INPUT -p tcp --dport {{ hadoop['mapred_ha_zkfc_port'] }} -j ACCEPT
+-A INPUT -p tcp --dport {{ hadoop['dfs_ha_zkfc_port'] }} -j ACCEPT
+
+{% endif %}
+{% if 'hadoop_slaves' in group_names %}
+-A INPUT -p tcp --dport {{ hadoop['dfs_datanode_address_port'] }} -j ACCEPT
+-A INPUT -p tcp --dport {{ hadoop['dfs_datanode_http_address_port'] }} -j ACCEPT
+-A INPUT -p tcp --dport {{ hadoop['dfs_datanode_ipc_address_port'] }} -j ACCEPT
+-A INPUT -p tcp --dport {{ hadoop['mapred_task_tracker_http_address_port'] }} -j ACCEPT
+{% endif %}
+{% if 'qjournal_servers' in group_names %}
+-A INPUT -p tcp --dport {{ hadoop['qjournal_port'] }} -j ACCEPT
+-A INPUT -p tcp --dport {{ hadoop['qjournal_http_port'] }} -j ACCEPT
+{% endif %}
+{% if 'zookeeper_servers' in group_names %}
+-A INPUT -p tcp --dport {{ hadoop['zookeeper_clientport'] }} -j ACCEPT
+-A INPUT -p tcp --dport {{ hadoop['zookeeper_leader_port'] }} -j ACCEPT
+-A INPUT -p tcp --dport {{ hadoop['zookeeper_election_port'] }} -j ACCEPT
+{% endif %}
+-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
+-A INPUT -p icmp -j ACCEPT
+-A INPUT -i lo -j ACCEPT
+-A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
+-A INPUT -j REJECT --reject-with icmp-host-prohibited
+-A FORWARD -j REJECT --reject-with icmp-host-prohibited
+COMMIT
+
+
diff --git a/hadoop/roles/hadoop_primary/handlers/main.yml b/hadoop/roles/hadoop_primary/handlers/main.yml
new file mode 100644
index 0000000..c493318
--- /dev/null
+++ b/hadoop/roles/hadoop_primary/handlers/main.yml
@@ -0,0 +1,14 @@
+---
+# Handlers for the hadoop master services
+
+- name: restart hadoop master services
+  service: name={{ item }} state=restarted
+  with_items:
+    - hadoop-0.20-mapreduce-jobtracker
+    - hadoop-hdfs-namenode
+
+- name: restart hadoopha master services
+  service: name={{ item }} state=restarted
+  with_items:
+    - hadoop-0.20-mapreduce-jobtrackerha
+    - hadoop-hdfs-namenode
diff --git a/hadoop/roles/hadoop_primary/tasks/hadoop_master.yml b/hadoop/roles/hadoop_primary/tasks/hadoop_master.yml
new file mode 100644
index 0000000..67632ec
--- /dev/null
+++ b/hadoop/roles/hadoop_primary/tasks/hadoop_master.yml
@@ -0,0 +1,38 @@
+---
+# Playbook for Hadoop master servers
+
+- name: Install the namenode and jobtracker packages
+  yum: name={{ item }} state=installed
+  with_items:
+    - hadoop-0.20-mapreduce-jobtrackerha
+    - hadoop-hdfs-namenode
+    - hadoop-hdfs-zkfc
+    - hadoop-0.20-mapreduce-zkfc
+
+- name: Copy the hadoop configuration files
+  template: src=roles/common/templates/hadoop_ha_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
+  with_items:
+    - core-site.xml
+    - hadoop-metrics.properties
+    - hadoop-metrics2.properties
+    - hdfs-site.xml
+    - log4j.properties
+    - mapred-site.xml
+    - slaves
+    - ssl-client.xml.example
+    - ssl-server.xml.example
+  notify: restart hadoopha master services
+
+- name: Create the data directory for the namenode metadata
+  file: path={{ item }} owner=hdfs group=hdfs state=directory
+  with_items: hadoop.dfs_namenode_name_dir
+
+- name: Create the data directory for the jobtracker ha
+  file: path={{ item }} owner=mapred group=mapred state=directory
+  with_items: hadoop.mapred_job_tracker_persist_jobstatus_dir
+
+- name: Format the namenode
+  shell: creates=/usr/lib/hadoop/namenode.formatted su - hdfs -c "hadoop namenode -format"; touch /usr/lib/hadoop/namenode.formatted
+
+- name: start hadoop namenode services
+  service: name=hadoop-hdfs-namenode state=started
diff --git a/hadoop/roles/hadoop_primary/tasks/hadoop_master_no_ha.yml b/hadoop/roles/hadoop_primary/tasks/hadoop_master_no_ha.yml
new file mode 100644
index 0000000..3508c92
--- /dev/null
+++ b/hadoop/roles/hadoop_primary/tasks/hadoop_master_no_ha.yml
@@ -0,0 +1,38 @@
+---
+# Playbook for Hadoop master servers
+
+- name: Install the namenode and jobtracker packages
+  yum: name={{ item }} state=installed
+  with_items:
+    - hadoop-0.20-mapreduce-jobtracker
+    - hadoop-hdfs-namenode
+
+- name: Copy the hadoop configuration files for no ha
+  template: src=roles/common/templates/hadoop_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
+  with_items:
+    - core-site.xml
+    - hadoop-metrics.properties
+    - hadoop-metrics2.properties
+    - hdfs-site.xml
+    - log4j.properties
+    - mapred-site.xml
+    - slaves
+    - ssl-client.xml.example
+    - ssl-server.xml.example
+  notify: restart hadoop master services
+
+- name: Create the data directory for the namenode metadata
+  file: path={{ item }} owner=hdfs group=hdfs state=directory
+  with_items: hadoop.dfs_namenode_name_dir
+
+- name: Format the namenode
+  shell: creates=/usr/lib/hadoop/namenode.formatted su - hdfs -c "hadoop namenode -format"; touch /usr/lib/hadoop/namenode.formatted
+
+- name: start hadoop namenode services
+  service: name=hadoop-hdfs-namenode state=started
+
+- name: Give permissions for mapred users
+  shell: creates=/usr/lib/hadoop/fs.initialized su - hdfs -c "hadoop fs -chown hdfs:hadoop /"; su - hdfs -c "hadoop fs -chmod 0774 /"; touch /usr/lib/hadoop/fs.initialized
+
+- name: start hadoop jobtracker services
+  service: name=hadoop-0.20-mapreduce-jobtracker state=started
diff --git a/hadoop/roles/hadoop_primary/tasks/main.yml b/hadoop/roles/hadoop_primary/tasks/main.yml
new file mode 100644
index 0000000..9693414
--- /dev/null
+++ b/hadoop/roles/hadoop_primary/tasks/main.yml
@@ -0,0 +1,9 @@
+---
+# Playbook for Hadoop master primary servers
+
+- include: hadoop_master.yml
+  when: ha_enabled
+
+- include: hadoop_master_no_ha.yml
+  when: not ha_enabled
+
diff --git a/hadoop/roles/hadoop_secondary/handlers/main.yml b/hadoop/roles/hadoop_secondary/handlers/main.yml
new file mode 100644
index 0000000..c493318
--- /dev/null
+++ b/hadoop/roles/hadoop_secondary/handlers/main.yml
@@ -0,0 +1,14 @@
+---
+# Handlers for the hadoop master services
+
+- name: restart hadoop master services
+  service: name={{ item }} state=restarted
+  with_items:
+    - hadoop-0.20-mapreduce-jobtracker
+    - hadoop-hdfs-namenode
+
+- name: restart hadoopha master services
+  service: name={{ item }} state=restarted
+  with_items:
+    - hadoop-0.20-mapreduce-jobtrackerha
+    - hadoop-hdfs-namenode
diff --git a/hadoop/roles/hadoop_secondary/tasks/main.yml b/hadoop/roles/hadoop_secondary/tasks/main.yml
new file mode 100644
index 0000000..4836c45
--- /dev/null
+++ b/hadoop/roles/hadoop_secondary/tasks/main.yml
@@ -0,0 +1,64 @@
+---
+# Playbook for Hadoop master secondary server
+
+- name: Install the namenode and jobtracker packages
+  yum: name={{ item }} state=installed
+  with_items:
+    - hadoop-0.20-mapreduce-jobtrackerha
+    - hadoop-hdfs-namenode
+    - hadoop-hdfs-zkfc
+    - hadoop-0.20-mapreduce-zkfc
+
+- name: Copy the hadoop configuration files
+  template: src=roles/common/templates/hadoop_ha_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
+  with_items:
+    - core-site.xml
+    - hadoop-metrics.properties
+    - hadoop-metrics2.properties
+    - hdfs-site.xml
+    - log4j.properties
+    - mapred-site.xml
+    - slaves
+    - ssl-client.xml.example
+    - ssl-server.xml.example
+  notify: restart hadoopha master services
+
+- name: Create the data directory for the namenode metadata
+  file: path={{ item }} owner=hdfs group=hdfs state=directory
+  with_items: hadoop.dfs_namenode_name_dir
+
+- name: Create the data directory for the jobtracker ha
+  file: path={{ item }} owner=mapred group=mapred state=directory
+  with_items: hadoop.mapred_job_tracker_persist_jobstatus_dir
+
+- name: Initialize the secondary namenode
+  shell: creates=/usr/lib/hadoop/namenode.formatted su - hdfs -c "hadoop namenode -bootstrapStandby"; touch /usr/lib/hadoop/namenode.formatted
+
+- name: start hadoop namenode services
+  service: name=hadoop-hdfs-namenode state=started
+
+- name: Initialize the zkfc for namenode
+  shell: creates=/usr/lib/hadoop/zkfc.formatted su - hdfs -c "hdfs zkfc -formatZK"; touch /usr/lib/hadoop/zkfc.formatted
+
+- name: start zkfc for namenodes
+  service: name=hadoop-hdfs-zkfc state=started
+  delegate_to: '{{ item }}'
+  with_items: groups.hadoop_masters
+
+- name: Give permissions for mapred users
+  shell: creates=/usr/lib/hadoop/fs.initialized su - hdfs -c "hadoop fs -chown hdfs:hadoop /"; su - hdfs -c "hadoop fs -chmod 0774 /"; touch /usr/lib/hadoop/fs.initialized
+
+- name: Initialize the zkfc for jobtracker
+  shell: creates=/usr/lib/hadoop/zkfcjob.formatted su - mapred -c "hadoop mrzkfc -formatZK"; touch /usr/lib/hadoop/zkfcjob.formatted
+
+- name: start zkfc for jobtracker
+  service: name=hadoop-0.20-mapreduce-zkfc state=started
+  delegate_to: '{{ item }}'
+  with_items: groups.hadoop_masters
+
+- name: start hadoop jobtracker services
+  service: name=hadoop-0.20-mapreduce-jobtrackerha state=started
+  delegate_to: '{{ item }}'
+  with_items: groups.hadoop_masters
diff --git a/hadoop/roles/hadoop_slaves/handlers/main.yml b/hadoop/roles/hadoop_slaves/handlers/main.yml
new file mode 100644
index 0000000..82ccc5c
--- /dev/null
+++ b/hadoop/roles/hadoop_slaves/handlers/main.yml
@@ -0,0 +1,8 @@
+---
+# Handlers for the hadoop slave services
+
+- name: restart hadoop slave services
+  service: name={{ item }} state=restarted
+  with_items:
+    - hadoop-0.20-mapreduce-tasktracker
+    - hadoop-hdfs-datanode
diff --git a/hadoop/roles/hadoop_slaves/tasks/main.yml b/hadoop/roles/hadoop_slaves/tasks/main.yml
new file mode 100644
index 0000000..9056ea2
--- /dev/null
+++ b/hadoop/roles/hadoop_slaves/tasks/main.yml
@@ -0,0 +1,4 @@
+---
+# Playbook for Hadoop slave servers
+
+- include: slaves.yml tags=slaves
diff --git a/hadoop/roles/hadoop_slaves/tasks/slaves.yml b/hadoop/roles/hadoop_slaves/tasks/slaves.yml
new file mode 100644
index 0000000..2807bac
--- /dev/null
+++ b/hadoop/roles/hadoop_slaves/tasks/slaves.yml
@@ -0,0 +1,53 @@
+---
+# Playbook for Hadoop slave servers
+
+- name: Install the datanode and tasktracker packages
+  yum: name={{ item }} state=installed
+  with_items:
+    - hadoop-0.20-mapreduce-tasktracker
+    - hadoop-hdfs-datanode
+
+- name: Copy the hadoop configuration files
+  template: src=roles/common/templates/hadoop_ha_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
+  with_items:
+    - core-site.xml
+    - hadoop-metrics.properties
+    - hadoop-metrics2.properties
+    - hdfs-site.xml
+    - log4j.properties
+    - mapred-site.xml
+    - slaves
+    - ssl-client.xml.example
+    - ssl-server.xml.example
+  when: ha_enabled
+  notify: restart hadoop slave services
+
+- name: Copy the hadoop configuration files for non ha
+  template: src=roles/common/templates/hadoop_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
+  with_items:
+    - core-site.xml
+    - hadoop-metrics.properties
+    - hadoop-metrics2.properties
+    - hdfs-site.xml
+    - log4j.properties
+    - mapred-site.xml
+    - slaves
+    - ssl-client.xml.example
+    - ssl-server.xml.example
+  when: not ha_enabled
+  notify: restart hadoop slave services
+
+- name: Create the data directory for the slave nodes to store the data
+  file: path={{ item }} owner=hdfs group=hdfs state=directory
+  with_items: hadoop.dfs_datanode_data_dir
+
+- name: Create the data directory for the slave nodes for mapreduce
+  file: path={{ item }} owner=mapred group=mapred state=directory
+  with_items: hadoop.mapred_local_dir
+
+- name: start hadoop slave services
+  service: name={{ item }} state=started
+  with_items:
+    - hadoop-0.20-mapreduce-tasktracker
+    - hadoop-hdfs-datanode
+
diff --git a/hadoop/roles/qjournal_servers/handlers/main.yml b/hadoop/roles/qjournal_servers/handlers/main.yml
new file mode 100644
index 0000000..a466737
--- /dev/null
+++ b/hadoop/roles/qjournal_servers/handlers/main.yml
@@ -0,0 +1,5 @@
+---
+# The journal node handlers
+
+- name: restart qjournal services
+  service: name=hadoop-hdfs-journalnode state=restarted
diff --git a/hadoop/roles/qjournal_servers/tasks/main.yml b/hadoop/roles/qjournal_servers/tasks/main.yml
new file mode 100644
index 0000000..86fa9e3
--- /dev/null
+++ b/hadoop/roles/qjournal_servers/tasks/main.yml
@@ -0,0 +1,22 @@
+---
+# Playbook for the qjournal nodes
+
+- name: Install the qjournal package
+  yum: name=hadoop-hdfs-journalnode state=installed
+
+- name: Create folder for journaling
+  file: path={{ hadoop.dfs_journalnode_edits_dir }} state=directory owner=hdfs group=hdfs
+
+- name: Copy the hadoop configuration files
+  template: src=roles/common/templates/hadoop_ha_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
+  with_items:
+    - core-site.xml
+    - hadoop-metrics.properties
+    - hadoop-metrics2.properties
+    - hdfs-site.xml
+    - log4j.properties
+    - mapred-site.xml
+    - slaves
+    - ssl-client.xml.example
+    - ssl-server.xml.example
+  notify: restart qjournal services
diff --git a/hadoop/roles/zookeeper_servers/handlers/main.yml b/hadoop/roles/zookeeper_servers/handlers/main.yml
new file mode 100644
index 0000000..b0a9cc1
--- /dev/null
+++ b/hadoop/roles/zookeeper_servers/handlers/main.yml
@@ -0,0 +1,5 @@
+---
+# Handler for the zookeeper services
+
+- name: restart zookeeper
+  service: name=zookeeper-server state=restarted
diff --git a/hadoop/roles/zookeeper_servers/tasks/main.yml b/hadoop/roles/zookeeper_servers/tasks/main.yml
new file mode 100644
index 0000000..c791a39
--- /dev/null
+++ b/hadoop/roles/zookeeper_servers/tasks/main.yml
@@ -0,0 +1,13 @@
+---
+# The plays for zookeeper daemons
+
+- name: Install the zookeeper files
+  yum: name=zookeeper-server state=installed
+
+- name: Copy the configuration file for zookeeper
+  template: src=zoo.cfg.j2 dest=/etc/zookeeper/conf/zoo.cfg
+  notify: restart zookeeper
+
+- name: Initialize the zookeeper
+  shell: creates=/var/lib/zookeeper/myid service zookeeper-server init --myid={{ zoo_id }}
+
diff --git a/hadoop/roles/zookeeper_servers/templates/zoo.cfg.j2 b/hadoop/roles/zookeeper_servers/templates/zoo.cfg.j2
new file mode 100644
index 0000000..a5e3f9f
--- /dev/null
+++ b/hadoop/roles/zookeeper_servers/templates/zoo.cfg.j2
@@ -0,0 +1,9 @@
+tickTime=2000
+dataDir=/var/lib/zookeeper/
+clientPort={{ hadoop['zookeeper_clientport'] }}
+initLimit=5
+syncLimit=2
+{% for host in groups['zookeeper_servers'] %}
+server.{{ hostvars[host].zoo_id }}={{ host }}:{{ hadoop['zookeeper_leader_port'] }}:{{ hadoop['zookeeper_election_port'] }}
+{% endfor %}
+
diff --git a/hadoop/site.yml b/hadoop/site.yml
new file mode 100644
index 0000000..cbe33a9
--- /dev/null
+++ b/hadoop/site.yml
@@ -0,0 +1,28 @@
+---
+# The main playbook to deploy the site
+
+- hosts: hadoop_all
+  roles:
+    - common
+
+- hosts: zookeeper_servers
+  roles:
+    - { role: zookeeper_servers, when: ha_enabled }
+
+- hosts: qjournal_servers
+  roles:
+    - { role: qjournal_servers, when: ha_enabled }
+
+- hosts: hadoop_master_primary
+  roles:
+    - { role: hadoop_primary }
+
+- hosts: hadoop_master_secondary
+  roles:
+    - { role: hadoop_secondary, when: ha_enabled }
+
+- hosts: hadoop_slaves
+  roles:
+    - { role: hadoop_slaves }
+