@ -0,0 +1,359 @@
# Deploying Hadoop Clusters using Ansible

## Preface

The playbooks in this example are designed to deploy a Hadoop cluster on a
CentOS 6 or RHEL 6 environment using Ansible. The playbooks can:

1) Deploy a fully functional Hadoop cluster with HA and automatic failover.

2) Deploy a fully functional Hadoop cluster with no HA.

3) Deploy additional nodes to scale the cluster.

These playbooks require Ansible 1.2, CentOS 6 or RHEL 6 target machines, and install
the open-source Cloudera Hadoop Distribution (CDH) version 4.

## Hadoop Components

Hadoop is a framework that allows the processing of large datasets across large
clusters. The two main components that make up a Hadoop cluster are the HDFS
filesystem and the MapReduce framework. Briefly, the HDFS filesystem is responsible
for storing data across the cluster nodes on their local disks, while MapReduce
jobs are the tasks that run on these nodes to produce a meaningful result
from the data stored on the HDFS filesystem.

Let's have a closer look at each of these components.

## HDFS

![Alt text](/images/hdfs.png "HDFS")

The above diagram illustrates an HDFS filesystem. The cluster consists of three
DataNodes, which are responsible for storing/replicating data, while the NameNode
is a process which is responsible for storing the metadata for the entire
filesystem. As the example above illustrates, when a client wants to write a
file to the HDFS cluster, it first contacts the NameNode and lets it know that
it wants to write a file. The NameNode then decides where and how the file
should be saved and notifies the client of its decision.

In the given example, "File1" has a size of 128 MB and the block size of the HDFS
filesystem is 64 MB. Hence, the NameNode instructs the client to break down the
file into two different blocks and write the first block to DataNode 1 and the
second block to DataNode 2. Upon receiving the notification from the NameNode,
the client contacts DataNode 1 and DataNode 2 directly to write the data.

Once the data is received by the DataNodes, they replicate the block across the
other nodes. The number of nodes across which the data is replicated is
based on the HDFS configuration, the default value being 3. Meanwhile the
NameNode updates its metadata with the entry of the new file "File1" and the
locations where the parts are stored.
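
The block-splitting arithmetic above can be sketched in a few lines of Python. This is an illustrative sketch, not part of the playbooks; the 64 MB block size and 128 MB file size come from the example:

```python
# Illustrative sketch of HDFS block splitting; the 64 MB block size
# matches the example above, not necessarily your cluster's config.
BLOCK_SIZE = 64 * 1024 * 1024  # 64 MB

def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the size in bytes of each block the file occupies."""
    blocks = []
    while file_size > 0:
        blocks.append(min(block_size, file_size))
        file_size -= block_size
    return blocks

# "File1" is 128 MB, so the NameNode has the client write two 64 MB blocks.
print(len(split_into_blocks(128 * 1024 * 1024)))  # 2
```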

## MapReduce

MapReduce is a Java application that utilizes the data stored in the
HDFS filesystem to produce some useful and meaningful result. The whole job is
split into two parts: the "Map" job and the "Reduce" job.

Let's consider an example. In the previous step we uploaded "File1"
into the HDFS filesystem and the file was broken down into two different
blocks. Let's assume that the first block had the data "black sheep" in it and
the second block had the data "white sheep" in it. Now let's assume a client
wants to get a count of all the words occurring in "File1". In order to get the
count, the first thing the client would have to do is write a "Map" program and
then a "Reduce" program.

Here's pseudocode of how the Map and Reduce jobs might look:

	mapper (File1, file-contents):
		for each word in file-contents:
			emit (word, 1)

	reducer (word, values):
		sum = 0
		for each value in values:
			sum = sum + value
		emit (word, sum)

The work of the Map job is to go through all the words in the file and emit
a key/value pair. In this case the key is the word itself and the value is always
1.

The Reduce job is quite simple: for each key, it adds up all the values it
receives and emits the total.
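
The pseudocode above translates almost line-for-line into runnable Python. This is an illustrative stand-in for the Java MapReduce job, using the block contents from the example ("black sheep" and "white sheep"):

```python
from collections import defaultdict

def mapper(file_contents):
    # Emit (word, 1) for every word, exactly as the Map pseudocode does.
    return [(word, 1) for word in file_contents.split()]

def reducer(pairs):
    # Sum the values emitted for each word.
    counts = defaultdict(int)
    for word, value in pairs:
        counts[word] += value
    return dict(counts)

# One mapper call per block, as in the example.
pairs = mapper("black sheep") + mapper("white sheep")
print(reducer(pairs))  # {'black': 1, 'sheep': 2, 'white': 1}
```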

Once the Map and Reduce jobs are ready, the client instructs the
"JobTracker" (the process responsible for scheduling the jobs on the cluster)
to run the MapReduce job on "File1".

Let's have a closer look at the anatomy of a Map job.

![Alt text](/images/map.png "Map job")

As the figure above shows, when the client instructs the JobTracker to run a
job on File1, the JobTracker first contacts the NameNode to determine where the
blocks of File1 are. Then the JobTracker sends the Map job's JAR file down
to the nodes having the blocks, and the TaskTracker processes on those nodes run
the application.

In the above example, DataNode 1 and DataNode 2 have the blocks, so the
TaskTrackers on those nodes run the Map jobs. Once the jobs are completed the
two nodes would have key/value results as below:

	MapJob Results:

	TaskTracker1:
	"Black: 1"
	"Sheep: 1"

	TaskTracker2:
	"White: 1"
	"Sheep: 1"

Once the Map phase is completed, the JobTracker process initiates the Shuffle
and Reduce process.

Let's have a closer look at the Shuffle-Reduce job.

![Alt text](/images/reduce.png "Reduce job")

As the figure above demonstrates, the first thing that the JobTracker does is
spawn a Reducer job on the DataNode/TaskTracker nodes for each "key" in the job
result. In this case we have three keys: "black", "white", and "sheep", so
three Reducers are spawned: one for each key. The Map jobs shuffle the keys
out to the respective Reduce jobs. Then the Reduce job code runs and the sum is
calculated, and the result is written into the HDFS filesystem in a common
directory. In the above example the output directory is specified as
"/home/ben/output", so all the Reducers will write their results into this
directory under different filenames, the file names being "part-00xx", where xx
is the Reducer/partition number.
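
The routing of keys to "part-00xx" files can be illustrated with a toy partitioner. The modulo-over-character-codes scheme below is an assumption for illustration only; real Hadoop uses a hash partitioner on the key, and the function name here is invented:

```python
# Toy partitioner: route each key to one of num_reducers reducers.
# Hadoop's default partitioner hashes the key mod the reducer count;
# this sketch mimics that idea with a simple character sum.
def partition(key, num_reducers):
    return sum(ord(c) for c in key) % num_reducers

keys = ["black", "white", "sheep"]
for key in keys:
    r = partition(key, 3)
    # Each reducer writes its result to a file named part-000<r>.
    print(f"{key} -> part-{r:05d}")
```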

## Hadoop Deployment

![Alt text](/images/hadoop.png "Hadoop deployment")

The above diagram depicts a typical Hadoop deployment. The NameNode and
JobTracker usually reside on the same machine, though they can run on separate
machines. The DataNodes and TaskTrackers run on the same nodes. The size of the
cluster can be scaled to thousands of nodes with petabytes of storage.

The above deployment model provides redundancy for data, as the HDFS filesystem
takes care of the data replication. The only single points of failure are the
NameNode and the JobTracker. If either of these components fails, the cluster
will not be usable.

## Making Hadoop HA

To make the Hadoop cluster highly available, we would have to add another set of
JobTracker/NameNodes, and make sure that any data updated by the master is also
updated on the secondary. In case of failure of the primary node, the
secondary node takes over that role.

The first thing that has to be dealt with is the data held by the NameNode. As
we recall, the NameNode holds all of the metadata about the filesystem, so any
update to the metadata should also be reflected on the secondary NameNode's
metadata copy. The synchronization of the primary and secondary NameNode
metadata is handled by the Quorum Journal Manager.

### Quorum Journal Manager

![Alt text](/images/qjm.png "QJM")

As the figure above shows, the Quorum Journal Manager consists of the journal
manager clients and journal manager nodes. The journal manager clients reside
on the same nodes as the NameNodes; the client on the primary node collects all the
edit logs happening on the NameNode and sends them out to the journal nodes. The
journal manager client residing on the secondary NameNode regularly contacts
the journal nodes and updates its local metadata to be consistent with the
master node. In case of primary node failure, the secondary NameNode updates
itself to the latest edit logs and takes over as the primary NameNode.
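
The quorum write that keeps the NameNodes in sync can be modeled in a few lines. This is a toy sketch of the idea (an edit counts as durable once a majority of journal nodes acknowledge it), not CDH4's actual JournalNode protocol; the class and function names are invented for illustration:

```python
# Toy model of quorum journaling: an edit is durable once a majority of
# journal nodes have stored it. Illustrative only; not the real protocol.
class JournalNode:
    def __init__(self):
        self.edits = []

    def store(self, edit):
        self.edits.append(edit)
        return True  # a real node could time out or fail here

def quorum_write(nodes, edit):
    acks = sum(1 for node in nodes if node.store(edit))
    return acks >= len(nodes) // 2 + 1  # majority, e.g. 2 of 3

nodes = [JournalNode() for _ in range(3)]
print(quorum_write(nodes, "mkdir /user/ben"))  # True: all 3 nodes acked
```

The standby NameNode can then replay the stored edits from any up-to-date majority of journal nodes before taking over.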

### Zookeeper

Apart from data consistency, a distributed cluster system also needs a
mechanism for centralized coordination. For example, there should be a way for
the secondary node to tell if the primary node is running properly, and if not,
it has to take up the role of the primary. Zookeeper provides Hadoop with a
mechanism to coordinate in this way.

![Alt text](/images/zookeeper.png "Zookeeper")

As the figure above shows, the Zookeeper service is a client/server based
service. The server component itself is replicated over a set of machines that
comprise the service. In short, high availability is built into the Zookeeper
servers.

For Hadoop, two Zookeeper clients have been built: one ZKFC (Zookeeper Failover
Controller) for the NameNode and one for the JobTracker. These clients run on
the same machines as the NameNodes/JobTrackers themselves.

When a ZKFC client is started, it establishes a connection with one of the
Zookeeper nodes and obtains a session ID. The client then keeps a health check
on the NameNode/JobTracker and sends heartbeats to ZooKeeper.

If the ZKFC client detects a failure of the NameNode/JobTracker, it removes
itself from the ZooKeeper active/standby election, and the other ZKFC client
fences the node/service and takes over the primary role.

## Hadoop HA Deployment

![Alt text](/images/hadoopha.png "Hadoop_HA")

The above diagram depicts a fully HA Hadoop cluster with no single point of
failure and automated failover.

## Deploying Hadoop Clusters with Ansible

Setting up a Hadoop cluster without HA is itself a challenging and
time-consuming task, and with HA, things become even more difficult.

Ansible can automate the whole process of deploying a Hadoop cluster, with or
without HA, using the same playbook, in a matter of minutes. This can be used for
quick environment rebuilds, and in case of disaster or node failures, recovery
time can be greatly reduced with Ansible automation.

Let's have a look to see how this is done.

## Deploying a Hadoop cluster with HA

### Prerequisites

These playbooks have been tested using Ansible v1.2 and CentOS 6.x (64 bit).

Modify group_vars/all to choose the network interface for Hadoop communication.

Optionally, you can change Hadoop-specific parameters like ports or directories
by editing the group_vars/all file.

Before launching the deployment playbook, make sure the inventory file (hosts)
is set up properly. Here's a sample:

	[hadoop_master_primary]
	zhadoop1

	[hadoop_master_secondary]
	zhadoop2

	[hadoop_masters:children]
	hadoop_master_primary
	hadoop_master_secondary

	[hadoop_slaves]
	hadoop1
	hadoop2
	hadoop3

	[qjournal_servers]
	zhadoop1
	zhadoop2
	zhadoop3

	[zookeeper_servers]
	zhadoop1 zoo_id=1
	zhadoop2 zoo_id=2
	zhadoop3 zoo_id=3

Once the inventory is set up, the Hadoop cluster can be set up using the following
command:

	ansible-playbook -i hosts site.yml

Once deployed, we can check the cluster sanity in different ways. To check the
status of the HDFS filesystem and get a report on all the DataNodes, log in as the
'hdfs' user on any Hadoop master server, and issue the following command to get
the report:

	hadoop dfsadmin -report

To check the sanity of HA, first log in as the 'hdfs' user on any Hadoop master
server and get the current active/standby NameNode servers this way:

	-bash-4.1$ hdfs haadmin -getServiceState zhadoop1
	active
	-bash-4.1$ hdfs haadmin -getServiceState zhadoop2
	standby

To get the state of the JobTracker process, log in as the 'mapred' user on any
Hadoop master server and issue the following command:

	-bash-4.1$ hadoop mrhaadmin -getServiceState hadoop1
	standby
	-bash-4.1$ hadoop mrhaadmin -getServiceState hadoop2
	active

Once you have determined which server is active and which is standby, you can
kill the NameNode/JobTracker process on the server listed as active and issue
the same commands again, and you should see that the standby has been promoted
to the active state. Later, you can restart the killed process and see that node
listed as standby.

### Running a MapReduce Job

To run a sample MapReduce job, execute the following script from any of the Hadoop
master nodes as the user 'hdfs', e.g. su - hdfs -c "/tmp/job.sh". The job counts
the number of occurrences of the word 'hello' in the given input file.

	#!/bin/bash
	cat > /tmp/inputfile << EOF
	hello
	sf
	sdf
	hello
	sdf
	sdf
	EOF
	hadoop fs -put /tmp/inputfile /inputfile
	hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar grep /inputfile /outputfile 'hello'
	hadoop fs -get /outputfile /tmp/outputfile/

To verify the result, read the file on the server located at
/tmp/outputfile/part-00000, which should give you the count.

## Scale the Cluster

When the Hadoop cluster reaches its maximum capacity, it can be scaled by
adding nodes. This can be easily accomplished by adding the node hostname to
the Ansible inventory under the hadoop_slaves group, and running the following
command:

	ansible-playbook -i hosts site.yml --tags=slaves

## Deploy a non-HA Hadoop Cluster

The following diagram illustrates a standalone Hadoop cluster.

To deploy this cluster, fill in the inventory file as follows:

	[hadoop_master_primary]
	zhadoop1

	[hadoop_master_secondary]

	[hadoop_masters:children]
	hadoop_master_primary
	hadoop_master_secondary

	[hadoop_slaves]
	hadoop1
	hadoop2
	hadoop3

Edit the group_vars/all file to disable HA:

	ha_enabled: False

And run the following command:

	ansible-playbook -i hosts site.yml

The validity of the cluster can be checked by running the same MapReduce job
that was documented above for the HA Hadoop cluster.
@ -0,0 +1,56 @@
# Defaults to the first ethernet interface. Change this to:
#
# iface: eth1
#
# ...to override.
#
iface: '{{ ansible_default_ipv4.interface }}'

ha_enabled: False

hadoop:

  # Variables for core-site.xml - common

  fs_default_FS_port: 8020
  nameservice_id: mycluster4

  # Variables for hdfs-site.xml

  dfs_permissions_superusergroup: hdfs
  dfs_namenode_name_dir:
    - /namedir1/
    - /namedir2/
  dfs_replication: 3
  dfs_namenode_handler_count: 50
  dfs_blocksize: 67108864
  dfs_datanode_data_dir:
    - /datadir1/
    - /datadir2/
  dfs_datanode_address_port: 50010
  dfs_datanode_http_address_port: 50075
  dfs_datanode_ipc_address_port: 50020
  dfs_namenode_http_address_port: 50070
  dfs_ha_zkfc_port: 8019
  qjournal_port: 8485
  qjournal_http_port: 8480
  dfs_journalnode_edits_dir: /journaldir/
  zookeeper_clientport: 2181
  zookeeper_leader_port: 2888
  zookeeper_election_port: 3888

  # Variables for mapred-site.xml - common
  mapred_job_tracker_ha_servicename: myjt4
  mapred_job_tracker_http_address_port: 50030
  mapred_task_tracker_http_address_port: 50060
  mapred_job_tracker_port: 8021
  mapred_ha_jobtracker_rpc-address_port: 8023
  mapred_ha_zkfc_port: 8018
  mapred_job_tracker_persist_jobstatus_dir: /jobdir/
  mapred_local_dir:
    - /mapred1/
    - /mapred2/
@ -0,0 +1,31 @@
[hadoop_all:children]
hadoop_masters
hadoop_slaves
qjournal_servers
zookeeper_servers

[hadoop_master_primary]
hadoop1

[hadoop_master_secondary]
hadoop2

[hadoop_masters:children]
hadoop_master_primary
hadoop_master_secondary

[hadoop_slaves]
hadoop1
hadoop2
hadoop3

[qjournal_servers]
hadoop1
hadoop2
hadoop3

[zookeeper_servers]
hadoop1 zoo_id=1
hadoop2 zoo_id=2
hadoop3 zoo_id=3
@ -0,0 +1,19 @@
asdf
sdf
sdf
sd
f
sf
sdf
sd
fsd
hello
asf
sf
sd
fsd
f
sdf
sd
hello
@ -0,0 +1,21 @@
---
# Launch a job to count the occurrences of a word.

- hosts: $server
  user: root
  tasks:
    - name: copy the file
      copy: src=inputfile dest=/tmp/inputfile

    - name: upload the file
      shell: su - hdfs -c "hadoop fs -put /tmp/inputfile /inputfile"

    - name: Run the MapReduce job to count the occurrences of the word hello
      shell: su - hdfs -c "hadoop jar /usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar grep /inputfile /outputfile 'hello'"

    - name: Fetch the outputfile to the local tmp dir
      shell: su - hdfs -c "hadoop fs -get /outputfile /tmp/outputfile"

    - name: Get the outputfile to the ansible server
      fetch: dest=/tmp src=/tmp/outputfile/part-00000
@ -0,0 +1,5 @@ |
||||
[cloudera-cdh4] |
||||
name=Cloudera's Distribution for Hadoop, Version 4 |
||||
baseurl=http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/4/ |
||||
gpgkey = http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera |
||||
gpgcheck = 1 |
@ -0,0 +1,2 @@
- name: restart iptables
  service: name=iptables state=restarted
@ -0,0 +1,28 @@
---
# The playbook for common tasks

- name: Deploy the Cloudera Repository
  copy: src=etc/cloudera-CDH4.repo dest=/etc/yum.repos.d/cloudera-CDH4.repo

- name: Install the openjdk package
  yum: name=java-1.6.0-openjdk state=installed

- name: Create a directory for java
  file: state=directory path=/usr/java/

- name: Create a link for java
  file: src=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre state=link path=/usr/java/default

- name: Create the hosts file for all machines
  template: src=etc/hosts.j2 dest=/etc/hosts

- name: Disable SELinux in conf file
  lineinfile: dest=/etc/sysconfig/selinux regexp='^SELINUX=' line='SELINUX=disabled' state=present

- name: Disable SELinux dynamically
  shell: creates=/etc/sysconfig/selinux.disabled setenforce 0 ; touch /etc/sysconfig/selinux.disabled

- name: Create the iptables file for all machines
  template: src=iptables.j2 dest=/etc/sysconfig/iptables
  notify: restart iptables
@ -0,0 +1,5 @@ |
||||
--- |
||||
# The playbook for common tasks |
||||
|
||||
- include: common.yml tags=slaves |
||||
|
@ -0,0 +1,5 @@ |
||||
127.0.0.1 localhost |
||||
{% for host in groups.all %} |
||||
{{ hostvars[host]['ansible_' + iface].ipv4.address }} {{ host }} |
||||
{% endfor %} |
||||
|
@ -0,0 +1,25 @@ |
||||
<?xml version="1.0"?> |
||||
<!-- |
||||
Licensed to the Apache Software Foundation (ASF) under one or more |
||||
contributor license agreements. See the NOTICE file distributed with |
||||
this work for additional information regarding copyright ownership. |
||||
The ASF licenses this file to You under the Apache License, Version 2.0 |
||||
(the "License"); you may not use this file except in compliance with |
||||
the License. You may obtain a copy of the License at |
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0 |
||||
|
||||
Unless required by applicable law or agreed to in writing, software |
||||
distributed under the License is distributed on an "AS IS" BASIS, |
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
||||
See the License for the specific language governing permissions and |
||||
limitations under the License. |
||||
--> |
||||
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?> |
||||
|
||||
<configuration> |
||||
<property> |
||||
<name>fs.defaultFS</name> |
||||
<value>hdfs://{{ hostvars[groups['hadoop_masters'][0]]['ansible_hostname'] + ':' ~ hadoop['fs_default_FS_port'] }}/</value> |
||||
</property> |
||||
</configuration> |
@ -0,0 +1,75 @@ |
||||
# Configuration of the "dfs" context for null |
||||
dfs.class=org.apache.hadoop.metrics.spi.NullContext |
||||
|
||||
# Configuration of the "dfs" context for file |
||||
#dfs.class=org.apache.hadoop.metrics.file.FileContext |
||||
#dfs.period=10 |
||||
#dfs.fileName=/tmp/dfsmetrics.log |
||||
|
||||
# Configuration of the "dfs" context for ganglia |
||||
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter) |
||||
# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext |
||||
# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 |
||||
# dfs.period=10 |
||||
# dfs.servers=localhost:8649 |
||||
|
||||
|
||||
# Configuration of the "mapred" context for null |
||||
mapred.class=org.apache.hadoop.metrics.spi.NullContext |
||||
|
||||
# Configuration of the "mapred" context for file |
||||
#mapred.class=org.apache.hadoop.metrics.file.FileContext |
||||
#mapred.period=10 |
||||
#mapred.fileName=/tmp/mrmetrics.log |
||||
|
||||
# Configuration of the "mapred" context for ganglia |
||||
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter) |
||||
# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext |
||||
# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 |
||||
# mapred.period=10 |
||||
# mapred.servers=localhost:8649 |
||||
|
||||
|
||||
# Configuration of the "jvm" context for null |
||||
#jvm.class=org.apache.hadoop.metrics.spi.NullContext |
||||
|
||||
# Configuration of the "jvm" context for file |
||||
#jvm.class=org.apache.hadoop.metrics.file.FileContext |
||||
#jvm.period=10 |
||||
#jvm.fileName=/tmp/jvmmetrics.log |
||||
|
||||
# Configuration of the "jvm" context for ganglia |
||||
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext |
||||
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 |
||||
# jvm.period=10 |
||||
# jvm.servers=localhost:8649 |
||||
|
||||
# Configuration of the "rpc" context for null |
||||
rpc.class=org.apache.hadoop.metrics.spi.NullContext |
||||
|
||||
# Configuration of the "rpc" context for file |
||||
#rpc.class=org.apache.hadoop.metrics.file.FileContext |
||||
#rpc.period=10 |
||||
#rpc.fileName=/tmp/rpcmetrics.log |
||||
|
||||
# Configuration of the "rpc" context for ganglia |
||||
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext |
||||
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 |
||||
# rpc.period=10 |
||||
# rpc.servers=localhost:8649 |
||||
|
||||
|
||||
# Configuration of the "ugi" context for null |
||||
ugi.class=org.apache.hadoop.metrics.spi.NullContext |
||||
|
||||
# Configuration of the "ugi" context for file |
||||
#ugi.class=org.apache.hadoop.metrics.file.FileContext |
||||
#ugi.period=10 |
||||
#ugi.fileName=/tmp/ugimetrics.log |
||||
|
||||
# Configuration of the "ugi" context for ganglia |
||||
# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext |
||||
# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 |
||||
# ugi.period=10 |
||||
# ugi.servers=localhost:8649 |
||||
|
@ -0,0 +1,44 @@ |
||||
# |
||||
# Licensed to the Apache Software Foundation (ASF) under one or more |
||||
# contributor license agreements. See the NOTICE file distributed with |
||||
# this work for additional information regarding copyright ownership. |
||||
# The ASF licenses this file to You under the Apache License, Version 2.0 |
||||
# (the "License"); you may not use this file except in compliance with |
||||
# the License. You may obtain a copy of the License at |
||||
# |
||||
# http://www.apache.org/licenses/LICENSE-2.0 |
||||
# |
||||
# Unless required by applicable law or agreed to in writing, software |
||||
# distributed under the License is distributed on an "AS IS" BASIS, |
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
||||
# See the License for the specific language governing permissions and |
||||
# limitations under the License. |
||||
# |
||||
|
||||
# syntax: [prefix].[source|sink].[instance].[options] |
||||
# See javadoc of package-info.java for org.apache.hadoop.metrics2 for details |
||||
|
||||
*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink |
||||
# default sampling period, in seconds |
||||
*.period=10 |
||||
|
||||
# The namenode-metrics.out will contain metrics from all context |
||||
#namenode.sink.file.filename=namenode-metrics.out |
||||
# Specifying a special sampling period for namenode: |
||||
#namenode.sink.*.period=8 |
||||
|
||||
#datanode.sink.file.filename=datanode-metrics.out |
||||
|
||||
# the following example split metrics of different |
||||
# context to different sinks (in this case files) |
||||
#jobtracker.sink.file_jvm.context=jvm |
||||
#jobtracker.sink.file_jvm.filename=jobtracker-jvm-metrics.out |
||||
#jobtracker.sink.file_mapred.context=mapred |
||||
#jobtracker.sink.file_mapred.filename=jobtracker-mapred-metrics.out |
||||
|
||||
#tasktracker.sink.file.filename=tasktracker-metrics.out |
||||
|
||||
#maptask.sink.file.filename=maptask-metrics.out |
||||
|
||||
#reducetask.sink.file.filename=reducetask-metrics.out |
||||
|
@ -0,0 +1,57 @@ |
||||
<?xml version="1.0"?> |
||||
<!-- |
||||
Licensed to the Apache Software Foundation (ASF) under one or more |
||||
contributor license agreements. See the NOTICE file distributed with |
||||
this work for additional information regarding copyright ownership. |
||||
The ASF licenses this file to You under the Apache License, Version 2.0 |
||||
(the "License"); you may not use this file except in compliance with |
||||
the License. You may obtain a copy of the License at |
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0 |
||||
|
||||
Unless required by applicable law or agreed to in writing, software |
||||
distributed under the License is distributed on an "AS IS" BASIS, |
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
||||
See the License for the specific language governing permissions and |
||||
limitations under the License. |
||||
--> |
||||
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?> |
||||
|
||||
<configuration> |
||||
<property> |
||||
<name>dfs.blocksize</name> |
||||
<value>{{ hadoop['dfs_blocksize'] }}</value> |
||||
</property> |
||||
<property> |
||||
<name>dfs.permissions.superusergroup</name> |
||||
<value>{{ hadoop['dfs_permissions_superusergroup'] }}</value> |
||||
</property> |
||||
<property> |
||||
<name>dfs.namenode.http.address</name> |
||||
<value>0.0.0.0:{{ hadoop['dfs_namenode_http_address_port'] }}</value> |
||||
</property> |
||||
<property> |
||||
<name>dfs.datanode.address</name> |
||||
<value>0.0.0.0:{{ hadoop['dfs_datanode_address_port'] }}</value> |
||||
</property> |
||||
<property> |
||||
<name>dfs.datanode.http.address</name> |
||||
<value>0.0.0.0:{{ hadoop['dfs_datanode_http_address_port'] }}</value> |
||||
</property> |
||||
<property> |
||||
<name>dfs.datanode.ipc.address</name> |
||||
<value>0.0.0.0:{{ hadoop['dfs_datanode_ipc_address_port'] }}</value> |
||||
</property> |
||||
<property> |
||||
<name>dfs.replication</name> |
||||
<value>{{ hadoop['dfs_replication'] }}</value> |
||||
</property> |
||||
<property> |
||||
<name>dfs.namenode.name.dir</name> |
||||
<value>{{ hadoop['dfs_namenode_name_dir'] | join(',') }}</value> |
||||
</property> |
||||
<property> |
||||
<name>dfs.datanode.data.dir</name> |
||||
<value>{{ hadoop['dfs_datanode_data_dir'] | join(',') }}</value> |
||||
</property> |
||||
</configuration> |
@ -0,0 +1,219 @@ |
||||
# Copyright 2011 The Apache Software Foundation |
||||
# |
||||
# Licensed to the Apache Software Foundation (ASF) under one |
||||
# or more contributor license agreements. See the NOTICE file |
||||
# distributed with this work for additional information |
||||
# regarding copyright ownership. The ASF licenses this file |
||||
# to you under the Apache License, Version 2.0 (the |
||||
# "License"); you may not use this file except in compliance |
||||
# with the License. You may obtain a copy of the License at |
||||
# |
||||
# http://www.apache.org/licenses/LICENSE-2.0 |
||||
# |
||||
# Unless required by applicable law or agreed to in writing, software |
||||
# distributed under the License is distributed on an "AS IS" BASIS, |
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. |
||||
# See the License for the specific language governing permissions and |
||||
# limitations under the License. |
||||
|
||||
# Define some default values that can be overridden by system properties |
||||
hadoop.root.logger=INFO,console |
||||
hadoop.log.dir=. |
||||
hadoop.log.file=hadoop.log |
||||
|
||||
# Define the root logger to the system property "hadoop.root.logger". |
||||
log4j.rootLogger=${hadoop.root.logger}, EventCounter |
||||
|
||||
# Logging Threshold |
||||
log4j.threshold=ALL |
||||
|
||||
# Null Appender |
||||
log4j.appender.NullAppender=org.apache.log4j.varia.NullAppender |
||||
|
||||
# |
||||
# Rolling File Appender - cap space usage at 5gb. |
||||
# |
||||
hadoop.log.maxfilesize=256MB |
||||
hadoop.log.maxbackupindex=20 |
||||
log4j.appender.RFA=org.apache.log4j.RollingFileAppender |
||||
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file} |
||||
|
||||
log4j.appender.RFA.MaxFileSize=${hadoop.log.maxfilesize} |
||||
log4j.appender.RFA.MaxBackupIndex=${hadoop.log.maxbackupindex} |
||||
|
||||
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout |
||||
|
||||
# Pattern format: Date LogLevel LoggerName LogMessage |
||||
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n |
||||
# Debugging Pattern format |
||||
#log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n |
||||
|
||||
|
||||
# |
||||
# Daily Rolling File Appender |
||||
# |
||||
|
||||
log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender |
||||
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file} |
||||
|
||||
# Rollver at midnight |
||||
log4j.appender.DRFA.DatePattern=.yyyy-MM-dd |
||||
|
||||
# 30-day backup |
||||
#log4j.appender.DRFA.MaxBackupIndex=30 |
||||
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout |
||||
|
||||
# Pattern format: Date LogLevel LoggerName LogMessage |
||||
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n |
||||
# Debugging Pattern format |
||||
#log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n |
||||
|
||||
|
||||
# |
||||
# console |
||||
# Add "console" to rootlogger above if you want to use this |
||||
# |
||||
|
||||
log4j.appender.console=org.apache.log4j.ConsoleAppender |
||||
log4j.appender.console.target=System.err |
||||
log4j.appender.console.layout=org.apache.log4j.PatternLayout |
||||
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n |
||||
|
||||
# |
||||
# TaskLog Appender |
||||
# |
||||
|
||||
#Default values |
||||
hadoop.tasklog.taskid=null |
||||
hadoop.tasklog.iscleanup=false |
||||
hadoop.tasklog.noKeepSplits=4 |
||||
hadoop.tasklog.totalLogFileSize=100 |
||||
hadoop.tasklog.purgeLogSplits=true |
||||
hadoop.tasklog.logsRetainHours=12 |
||||
|
||||
log4j.appender.TLA=org.apache.hadoop.mapred.TaskLogAppender |
||||
log4j.appender.TLA.taskId=${hadoop.tasklog.taskid} |
||||
log4j.appender.TLA.isCleanup=${hadoop.tasklog.iscleanup} |
||||
log4j.appender.TLA.totalLogFileSize=${hadoop.tasklog.totalLogFileSize} |
||||
|
||||
log4j.appender.TLA.layout=org.apache.log4j.PatternLayout |
||||
log4j.appender.TLA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n |
||||
|
||||
# |
||||
# HDFS block state change log from block manager |
||||
# |
||||
# Uncomment the following to suppress normal block state change |
||||
# messages from BlockManager in NameNode. |
||||
#log4j.logger.BlockStateChange=WARN |
||||
|
||||
# |
||||
#Security appender |
||||
# |
||||
hadoop.security.logger=INFO,NullAppender |
||||
hadoop.security.log.maxfilesize=256MB |
||||
hadoop.security.log.maxbackupindex=20 |
||||
log4j.category.SecurityLogger=${hadoop.security.logger} |
||||
hadoop.security.log.file=SecurityAuth-${user.name}.audit |
||||
log4j.appender.RFAS=org.apache.log4j.RollingFileAppender |
||||
log4j.appender.RFAS.File=${hadoop.log.dir}/${hadoop.security.log.file} |
||||
log4j.appender.RFAS.layout=org.apache.log4j.PatternLayout |
||||
log4j.appender.RFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n |
||||
log4j.appender.RFAS.MaxFileSize=${hadoop.security.log.maxfilesize} |
||||
log4j.appender.RFAS.MaxBackupIndex=${hadoop.security.log.maxbackupindex} |
||||
|
||||
# |
||||
# Daily Rolling Security appender |
||||
# |
||||
log4j.appender.DRFAS=org.apache.log4j.DailyRollingFileAppender |
||||
log4j.appender.DRFAS.File=${hadoop.log.dir}/${hadoop.security.log.file} |
||||
log4j.appender.DRFAS.layout=org.apache.log4j.PatternLayout |
||||
log4j.appender.DRFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n |
||||
log4j.appender.DRFAS.DatePattern=.yyyy-MM-dd |
||||
|
||||
# |
||||
# hdfs audit logging |
||||
# |
||||
hdfs.audit.logger=INFO,NullAppender |
||||
hdfs.audit.log.maxfilesize=256MB |
||||
hdfs.audit.log.maxbackupindex=20 |
||||
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger} |
||||
log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false |
||||
log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender |
||||
log4j.appender.RFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log |
||||
log4j.appender.RFAAUDIT.layout=org.apache.log4j.PatternLayout |
||||
log4j.appender.RFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n |
||||
log4j.appender.RFAAUDIT.MaxFileSize=${hdfs.audit.log.maxfilesize} |
||||
log4j.appender.RFAAUDIT.MaxBackupIndex=${hdfs.audit.log.maxbackupindex} |
||||
|
||||
# |
||||
# mapred audit logging |
||||
# |
||||
mapred.audit.logger=INFO,NullAppender |
||||
mapred.audit.log.maxfilesize=256MB |
||||
mapred.audit.log.maxbackupindex=20 |
||||
log4j.logger.org.apache.hadoop.mapred.AuditLogger=${mapred.audit.logger} |
||||
log4j.additivity.org.apache.hadoop.mapred.AuditLogger=false |
||||
log4j.appender.MRAUDIT=org.apache.log4j.RollingFileAppender |
||||
log4j.appender.MRAUDIT.File=${hadoop.log.dir}/mapred-audit.log |
||||
log4j.appender.MRAUDIT.layout=org.apache.log4j.PatternLayout |
||||
log4j.appender.MRAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n |
||||
log4j.appender.MRAUDIT.MaxFileSize=${mapred.audit.log.maxfilesize} |
||||
log4j.appender.MRAUDIT.MaxBackupIndex=${mapred.audit.log.maxbackupindex} |
||||
|
||||
# Custom Logging levels |
||||
|
||||
#log4j.logger.org.apache.hadoop.mapred.JobTracker=DEBUG |
||||
#log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG |
||||
#log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=DEBUG |
||||
|
||||
# Jets3t library |
||||
log4j.logger.org.jets3t.service.impl.rest.httpclient.RestS3Service=ERROR |
||||
|
||||
# |
||||
# Event Counter Appender |
||||
# Sends counts of logging messages at different severity levels to Hadoop Metrics. |
||||
# |
||||
log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter |
||||
|
||||
# |
||||
# Job Summary Appender |
||||
# |
||||
# Use following logger to send summary to separate file defined by |
||||
# hadoop.mapreduce.jobsummary.log.file : |
||||
# hadoop.mapreduce.jobsummary.logger=INFO,JSA |
||||
# |
||||
hadoop.mapreduce.jobsummary.logger=${hadoop.root.logger} |
||||
hadoop.mapreduce.jobsummary.log.file=hadoop-mapreduce.jobsummary.log |
||||
hadoop.mapreduce.jobsummary.log.maxfilesize=256MB |
||||
hadoop.mapreduce.jobsummary.log.maxbackupindex=20 |
||||
log4j.appender.JSA=org.apache.log4j.RollingFileAppender |
||||
log4j.appender.JSA.File=${hadoop.log.dir}/${hadoop.mapreduce.jobsummary.log.file} |
||||
log4j.appender.JSA.MaxFileSize=${hadoop.mapreduce.jobsummary.log.maxfilesize} |
||||
log4j.appender.JSA.MaxBackupIndex=${hadoop.mapreduce.jobsummary.log.maxbackupindex} |
||||
log4j.appender.JSA.layout=org.apache.log4j.PatternLayout |
||||
log4j.appender.JSA.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n |
||||
log4j.logger.org.apache.hadoop.mapred.JobInProgress$JobSummary=${hadoop.mapreduce.jobsummary.logger} |
||||
log4j.additivity.org.apache.hadoop.mapred.JobInProgress$JobSummary=false |
||||
|
||||
# |
||||
# Yarn ResourceManager Application Summary Log |
||||
# |
||||
# Set the ResourceManager summary log filename |
||||
#yarn.server.resourcemanager.appsummary.log.file=rm-appsummary.log |
||||
# Set the ResourceManager summary log level and appender |
||||
#yarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY |
||||
|
||||
# Appender for ResourceManager Application Summary Log |
||||
# Requires the following properties to be set |
||||
# - hadoop.log.dir (Hadoop Log directory) |
||||
# - yarn.server.resourcemanager.appsummary.log.file (resource manager app summary log filename) |
||||
# - yarn.server.resourcemanager.appsummary.logger (resource manager app summary log level and appender) |
||||
|
||||
#log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger} |
||||
#log4j.additivity.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=false |
||||
#log4j.appender.RMSUMMARY=org.apache.log4j.RollingFileAppender |
||||
#log4j.appender.RMSUMMARY.File=${hadoop.log.dir}/${yarn.server.resourcemanager.appsummary.log.file} |
||||
#log4j.appender.RMSUMMARY.MaxFileSize=256MB |
||||
#log4j.appender.RMSUMMARY.MaxBackupIndex=20 |
||||
#log4j.appender.RMSUMMARY.layout=org.apache.log4j.PatternLayout |
||||
#log4j.appender.RMSUMMARY.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n |
@ -0,0 +1,22 @@
<configuration>

  <property>
    <name>mapred.job.tracker</name>
    <value>{{ hostvars[groups['hadoop_masters'][0]]['ansible_hostname'] }}:{{ hadoop['mapred_job_tracker_port'] }}</value>
  </property>

  <property>
    <name>mapred.local.dir</name>
    <value>{{ hadoop["mapred_local_dir"] | join(',') }}</value>
  </property>

  <property>
    <name>mapred.task.tracker.http.address</name>
    <value>0.0.0.0:{{ hadoop['mapred_task_tracker_http_address_port'] }}</value>
  </property>
  <property>
    <name>mapred.job.tracker.http.address</name>
    <value>0.0.0.0:{{ hadoop['mapred_job_tracker_http_address_port'] }}</value>
  </property>

</configuration>
@ -0,0 +1,3 @@
{% for host in groups['hadoop_slaves'] %}
{{ host }}
{% endfor %}
@ -0,0 +1,80 @@
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements. See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License. You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<configuration>

  <property>
    <name>ssl.client.truststore.location</name>
    <value></value>
    <description>Truststore to be used by clients like distcp. Must be
    specified.
    </description>
  </property>

  <property>
    <name>ssl.client.truststore.password</name>
    <value></value>
    <description>Optional. Default value is "".
    </description>
  </property>

  <property>
    <name>ssl.client.truststore.type</name>
    <value>jks</value>
    <description>Optional. The keystore file format, default value is "jks".
    </description>
  </property>

  <property>
    <name>ssl.client.truststore.reload.interval</name>
    <value>10000</value>
    <description>Truststore reload check interval, in milliseconds.
    Default value is 10000 (10 seconds).
    </description>
  </property>

  <property>
    <name>ssl.client.keystore.location</name>
    <value></value>
    <description>Keystore to be used by clients like distcp. Must be
    specified.
    </description>
  </property>

  <property>
    <name>ssl.client.keystore.password</name>
    <value></value>
    <description>Optional. Default value is "".
    </description>
  </property>

  <property>
    <name>ssl.client.keystore.keypassword</name>
    <value></value>
    <description>Optional. Default value is "".
    </description>
  </property>

  <property>
    <name>ssl.client.keystore.type</name>
    <value>jks</value>
    <description>Optional. The keystore file format, default value is "jks".
    </description>
  </property>

</configuration>
@ -0,0 +1,77 @@
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements. See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License. You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<configuration>

  <property>
    <name>ssl.server.truststore.location</name>
    <value></value>
    <description>Truststore to be used by NN and DN. Must be specified.
    </description>
  </property>

  <property>
    <name>ssl.server.truststore.password</name>
    <value></value>
    <description>Optional. Default value is "".
    </description>
  </property>

  <property>
    <name>ssl.server.truststore.type</name>
    <value>jks</value>
    <description>Optional. The keystore file format, default value is "jks".
    </description>
  </property>

  <property>
    <name>ssl.server.truststore.reload.interval</name>
    <value>10000</value>
    <description>Truststore reload check interval, in milliseconds.
    Default value is 10000 (10 seconds).
    </description>
  </property>

  <property>
    <name>ssl.server.keystore.location</name>
    <value></value>
    <description>Keystore to be used by NN and DN. Must be specified.
    </description>
  </property>

  <property>
    <name>ssl.server.keystore.password</name>
    <value></value>
    <description>Must be specified.
    </description>
  </property>

  <property>
    <name>ssl.server.keystore.keypassword</name>
    <value></value>
    <description>Must be specified.
    </description>
  </property>

  <property>
    <name>ssl.server.keystore.type</name>
    <value>jks</value>
    <description>Optional. The keystore file format, default value is "jks".
    </description>
  </property>

</configuration>
@ -0,0 +1,25 @@
<?xml version="1.0"?>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements. See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License. You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://{{ hadoop['nameservice_id'] }}/</value>
  </property>
</configuration>
@ -0,0 +1,75 @@
# Configuration of the "dfs" context for null
dfs.class=org.apache.hadoop.metrics.spi.NullContext

# Configuration of the "dfs" context for file
#dfs.class=org.apache.hadoop.metrics.file.FileContext
#dfs.period=10
#dfs.fileName=/tmp/dfsmetrics.log

# Configuration of the "dfs" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# dfs.period=10
# dfs.servers=localhost:8649


# Configuration of the "mapred" context for null
mapred.class=org.apache.hadoop.metrics.spi.NullContext

# Configuration of the "mapred" context for file
#mapred.class=org.apache.hadoop.metrics.file.FileContext
#mapred.period=10
#mapred.fileName=/tmp/mrmetrics.log

# Configuration of the "mapred" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# mapred.period=10
# mapred.servers=localhost:8649


# Configuration of the "jvm" context for null
#jvm.class=org.apache.hadoop.metrics.spi.NullContext

# Configuration of the "jvm" context for file
#jvm.class=org.apache.hadoop.metrics.file.FileContext
#jvm.period=10
#jvm.fileName=/tmp/jvmmetrics.log

# Configuration of the "jvm" context for ganglia
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# jvm.period=10
# jvm.servers=localhost:8649

# Configuration of the "rpc" context for null
rpc.class=org.apache.hadoop.metrics.spi.NullContext

# Configuration of the "rpc" context for file
#rpc.class=org.apache.hadoop.metrics.file.FileContext
#rpc.period=10
#rpc.fileName=/tmp/rpcmetrics.log

# Configuration of the "rpc" context for ganglia
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# rpc.period=10
# rpc.servers=localhost:8649


# Configuration of the "ugi" context for null
ugi.class=org.apache.hadoop.metrics.spi.NullContext

# Configuration of the "ugi" context for file
#ugi.class=org.apache.hadoop.metrics.file.FileContext
#ugi.period=10
#ugi.fileName=/tmp/ugimetrics.log

# Configuration of the "ugi" context for ganglia
# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext
# ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
# ugi.period=10
# ugi.servers=localhost:8649

@ -0,0 +1,44 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# syntax: [prefix].[source|sink].[instance].[options]
# See javadoc of package-info.java for org.apache.hadoop.metrics2 for details

*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
# default sampling period, in seconds
*.period=10

# The namenode-metrics.out will contain metrics from all contexts
#namenode.sink.file.filename=namenode-metrics.out
# Specifying a special sampling period for namenode:
#namenode.sink.*.period=8

#datanode.sink.file.filename=datanode-metrics.out

# The following example splits metrics of different
# contexts to different sinks (in this case files)
#jobtracker.sink.file_jvm.context=jvm
#jobtracker.sink.file_jvm.filename=jobtracker-jvm-metrics.out
#jobtracker.sink.file_mapred.context=mapred
#jobtracker.sink.file_mapred.filename=jobtracker-mapred-metrics.out

#tasktracker.sink.file.filename=tasktracker-metrics.out

#maptask.sink.file.filename=maptask-metrics.out

#reducetask.sink.file.filename=reducetask-metrics.out

@ -0,0 +1,103 @@
<?xml version="1.0"?>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements. See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License. You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>{{ hadoop['nameservice_id'] }}</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.{{ hadoop['nameservice_id'] }}</name>
    <value>{{ groups.hadoop_masters | join(',') }}</value>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>{{ hadoop['dfs_blocksize'] }}</value>
  </property>
  <property>
    <name>dfs.permissions.superusergroup</name>
    <value>{{ hadoop['dfs_permissions_superusergroup'] }}</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>{{ groups.zookeeper_servers | join(':' ~ hadoop['zookeeper_clientport'] ~ ',') }}:{{ hadoop['zookeeper_clientport'] }}</value>
  </property>

{% for host in groups['hadoop_masters'] %}
  <property>
    <name>dfs.namenode.rpc-address.{{ hadoop['nameservice_id'] }}.{{ host }}</name>
    <value>{{ host }}:{{ hadoop['fs_default_FS_port'] }}</value>
  </property>
{% endfor %}
{% for host in groups['hadoop_masters'] %}
  <property>
    <name>dfs.namenode.http-address.{{ hadoop['nameservice_id'] }}.{{ host }}</name>
    <value>{{ host }}:{{ hadoop['dfs_namenode_http_address_port'] }}</value>
  </property>
{% endfor %}
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://{{ groups.qjournal_servers | join(':' ~ hadoop['qjournal_port'] ~ ';') }}:{{ hadoop['qjournal_port'] }}/{{ hadoop['nameservice_id'] }}</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>{{ hadoop['dfs_journalnode_edits_dir'] }}</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.{{ hadoop['nameservice_id'] }}</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>shell(/bin/true)</value>
  </property>

  <property>
    <name>dfs.ha.zkfc.port</name>
    <value>{{ hadoop['dfs_ha_zkfc_port'] }}</value>
  </property>

  <property>
    <name>dfs.datanode.address</name>
    <value>0.0.0.0:{{ hadoop['dfs_datanode_address_port'] }}</value>
  </property>
  <property>
    <name>dfs.datanode.http.address</name>
    <value>0.0.0.0:{{ hadoop['dfs_datanode_http_address_port'] }}</value>
  </property>
  <property>
    <name>dfs.datanode.ipc.address</name>
    <value>0.0.0.0:{{ hadoop['dfs_datanode_ipc_address_port'] }}</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>{{ hadoop['dfs_replication'] }}</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>{{ hadoop['dfs_namenode_name_dir'] | join(',') }}</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>{{ hadoop['dfs_datanode_data_dir'] | join(',') }}</value>
  </property>
</configuration>
@ -0,0 +1,219 @@
# Copyright 2011 The Apache Software Foundation
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Define some default values that can be overridden by system properties
hadoop.root.logger=INFO,console
hadoop.log.dir=.
hadoop.log.file=hadoop.log

# Define the root logger to the system property "hadoop.root.logger".
log4j.rootLogger=${hadoop.root.logger}, EventCounter

# Logging Threshold
log4j.threshold=ALL

# Null Appender
log4j.appender.NullAppender=org.apache.log4j.varia.NullAppender

#
# Rolling File Appender - cap space usage at 5gb.
#
hadoop.log.maxfilesize=256MB
hadoop.log.maxbackupindex=20
log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}

log4j.appender.RFA.MaxFileSize=${hadoop.log.maxfilesize}
log4j.appender.RFA.MaxBackupIndex=${hadoop.log.maxbackupindex}

log4j.appender.RFA.layout=org.apache.log4j.PatternLayout

# Pattern format: Date LogLevel LoggerName LogMessage
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
# Debugging Pattern format
#log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n


#
# Daily Rolling File Appender
#

log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}

# Rollover at midnight
log4j.appender.DRFA.DatePattern=.yyyy-MM-dd

# 30-day backup
#log4j.appender.DRFA.MaxBackupIndex=30
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout

# Pattern format: Date LogLevel LoggerName LogMessage
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
# Debugging Pattern format
#log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n


#
# console
# Add "console" to root logger above if you want to use this
#

log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n

#
# TaskLog Appender
#

# Default values
hadoop.tasklog.taskid=null
hadoop.tasklog.iscleanup=false
hadoop.tasklog.noKeepSplits=4
hadoop.tasklog.totalLogFileSize=100
hadoop.tasklog.purgeLogSplits=true
hadoop.tasklog.logsRetainHours=12

log4j.appender.TLA=org.apache.hadoop.mapred.TaskLogAppender
log4j.appender.TLA.taskId=${hadoop.tasklog.taskid}
log4j.appender.TLA.isCleanup=${hadoop.tasklog.iscleanup}
log4j.appender.TLA.totalLogFileSize=${hadoop.tasklog.totalLogFileSize}

log4j.appender.TLA.layout=org.apache.log4j.PatternLayout
log4j.appender.TLA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n

#
# HDFS block state change log from block manager
#
# Uncomment the following to suppress normal block state change
# messages from BlockManager in NameNode.
#log4j.logger.BlockStateChange=WARN

#
# Security appender
#
hadoop.security.logger=INFO,NullAppender
hadoop.security.log.maxfilesize=256MB
hadoop.security.log.maxbackupindex=20
log4j.category.SecurityLogger=${hadoop.security.logger}
hadoop.security.log.file=SecurityAuth-${user.name}.audit
log4j.appender.RFAS=org.apache.log4j.RollingFileAppender
log4j.appender.RFAS.File=${hadoop.log.dir}/${hadoop.security.log.file}
log4j.appender.RFAS.layout=org.apache.log4j.PatternLayout
log4j.appender.RFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
log4j.appender.RFAS.MaxFileSize=${hadoop.security.log.maxfilesize}
log4j.appender.RFAS.MaxBackupIndex=${hadoop.security.log.maxbackupindex}

#
# Daily Rolling Security appender
#
log4j.appender.DRFAS=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFAS.File=${hadoop.log.dir}/${hadoop.security.log.file}
log4j.appender.DRFAS.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
log4j.appender.DRFAS.DatePattern=.yyyy-MM-dd

#
# hdfs audit logging
#
hdfs.audit.logger=INFO,NullAppender
hdfs.audit.log.maxfilesize=256MB
hdfs.audit.log.maxbackupindex=20
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger}
log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false
log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.RFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log
log4j.appender.RFAAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.RFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.RFAAUDIT.MaxFileSize=${hdfs.audit.log.maxfilesize}
log4j.appender.RFAAUDIT.MaxBackupIndex=${hdfs.audit.log.maxbackupindex}

#
# mapred audit logging
#
mapred.audit.logger=INFO,NullAppender
mapred.audit.log.maxfilesize=256MB
mapred.audit.log.maxbackupindex=20
log4j.logger.org.apache.hadoop.mapred.AuditLogger=${mapred.audit.logger}
log4j.additivity.org.apache.hadoop.mapred.AuditLogger=false
log4j.appender.MRAUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.MRAUDIT.File=${hadoop.log.dir}/mapred-audit.log
log4j.appender.MRAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.MRAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.MRAUDIT.MaxFileSize=${mapred.audit.log.maxfilesize}
log4j.appender.MRAUDIT.MaxBackupIndex=${mapred.audit.log.maxbackupindex}

# Custom Logging levels

#log4j.logger.org.apache.hadoop.mapred.JobTracker=DEBUG
#log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG
#log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=DEBUG

# Jets3t library
log4j.logger.org.jets3t.service.impl.rest.httpclient.RestS3Service=ERROR

#
# Event Counter Appender
# Sends counts of logging messages at different severity levels to Hadoop Metrics.
#
log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter

#
# Job Summary Appender
#
# Use following logger to send summary to separate file defined by
# hadoop.mapreduce.jobsummary.log.file :
# hadoop.mapreduce.jobsummary.logger=INFO,JSA
#
hadoop.mapreduce.jobsummary.logger=${hadoop.root.logger}
hadoop.mapreduce.jobsummary.log.file=hadoop-mapreduce.jobsummary.log
hadoop.mapreduce.jobsummary.log.maxfilesize=256MB
hadoop.mapreduce.jobsummary.log.maxbackupindex=20
log4j.appender.JSA=org.apache.log4j.RollingFileAppender
log4j.appender.JSA.File=${hadoop.log.dir}/${hadoop.mapreduce.jobsummary.log.file}
log4j.appender.JSA.MaxFileSize=${hadoop.mapreduce.jobsummary.log.maxfilesize}
log4j.appender.JSA.MaxBackupIndex=${hadoop.mapreduce.jobsummary.log.maxbackupindex}
log4j.appender.JSA.layout=org.apache.log4j.PatternLayout
log4j.appender.JSA.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n
log4j.logger.org.apache.hadoop.mapred.JobInProgress$JobSummary=${hadoop.mapreduce.jobsummary.logger}
log4j.additivity.org.apache.hadoop.mapred.JobInProgress$JobSummary=false

#
# Yarn ResourceManager Application Summary Log
#
# Set the ResourceManager summary log filename
#yarn.server.resourcemanager.appsummary.log.file=rm-appsummary.log
# Set the ResourceManager summary log level and appender
#yarn.server.resourcemanager.appsummary.logger=INFO,RMSUMMARY

# Appender for ResourceManager Application Summary Log
# Requires the following properties to be set
#    - hadoop.log.dir (Hadoop Log directory)
#    - yarn.server.resourcemanager.appsummary.log.file (resource manager app summary log filename)
#    - yarn.server.resourcemanager.appsummary.logger (resource manager app summary log level and appender)

#log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger}
||||
#log4j.additivity.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=false |
||||
#log4j.appender.RMSUMMARY=org.apache.log4j.RollingFileAppender |
||||
#log4j.appender.RMSUMMARY.File=${hadoop.log.dir}/${yarn.server.resourcemanager.appsummary.log.file} |
||||
#log4j.appender.RMSUMMARY.MaxFileSize=256MB |
||||
#log4j.appender.RMSUMMARY.MaxBackupIndex=20 |
||||
#log4j.appender.RMSUMMARY.layout=org.apache.log4j.PatternLayout |
||||
#log4j.appender.RMSUMMARY.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n |
@ -0,0 +1,120 @@
<configuration>

<property>
<name>mapred.job.tracker</name>
<value>{{ hadoop['mapred_job_tracker_ha_servicename'] }}</value>
</property>

<property>
<name>mapred.jobtrackers.{{ hadoop['mapred_job_tracker_ha_servicename'] }}</name>
<value>{{ groups['hadoop_masters'] | join(',') }}</value>
<description>Comma-separated list of JobTracker IDs.</description>
</property>

<property>
<name>mapred.ha.automatic-failover.enabled</name>
<value>true</value>
</property>

<property>
<name>mapred.ha.zkfc.port</name>
<value>{{ hadoop['mapred_ha_zkfc_port'] }}</value>
</property>

<property>
<name>mapred.ha.fencing.methods</name>
<value>shell(/bin/true)</value>
</property>

<property>
<name>ha.zookeeper.quorum</name>
<value>{{ groups.zookeeper_servers | join(':' ~ hadoop['zookeeper_clientport'] ~ ',') }}:{{ hadoop['zookeeper_clientport'] }}</value>
</property>

{% for host in groups['hadoop_masters'] %}
<property>
<name>mapred.jobtracker.rpc-address.{{ hadoop['mapred_job_tracker_ha_servicename'] }}.{{ host }}</name>
<value>{{ host }}:{{ hadoop['mapred_job_tracker_port'] }}</value>
</property>
{% endfor %}
{% for host in groups['hadoop_masters'] %}
<property>
<name>mapred.job.tracker.http.address.{{ hadoop['mapred_job_tracker_ha_servicename'] }}.{{ host }}</name>
<value>0.0.0.0:{{ hadoop['mapred_job_tracker_http_address_port'] }}</value>
</property>
{% endfor %}
{% for host in groups['hadoop_masters'] %}
<property>
<name>mapred.ha.jobtracker.rpc-address.{{ hadoop['mapred_job_tracker_ha_servicename'] }}.{{ host }}</name>
<value>{{ host }}:{{ hadoop['mapred_ha_jobtracker_rpc-address_port'] }}</value>
</property>
{% endfor %}
{% for host in groups['hadoop_masters'] %}
<property>
<name>mapred.ha.jobtracker.http-redirect-address.{{ hadoop['mapred_job_tracker_ha_servicename'] }}.{{ host }}</name>
<value>{{ host }}:{{ hadoop['mapred_job_tracker_http_address_port'] }}</value>
</property>
{% endfor %}

<property>
<name>mapred.jobtracker.restart.recover</name>
<value>true</value>
</property>

<property>
<name>mapred.job.tracker.persist.jobstatus.active</name>
<value>true</value>
</property>

<property>
<name>mapred.job.tracker.persist.jobstatus.hours</name>
<value>1</value>
</property>

<property>
<name>mapred.job.tracker.persist.jobstatus.dir</name>
<value>{{ hadoop['mapred_job_tracker_persist_jobstatus_dir'] }}</value>
</property>

<property>
<name>mapred.client.failover.proxy.provider.{{ hadoop['mapred_job_tracker_ha_servicename'] }}</name>
<value>org.apache.hadoop.mapred.ConfiguredFailoverProxyProvider</value>
</property>

<property>
<name>mapred.client.failover.max.attempts</name>
<value>15</value>
</property>

<property>
<name>mapred.client.failover.sleep.base.millis</name>
<value>500</value>
</property>

<property>
<name>mapred.client.failover.sleep.max.millis</name>
<value>1500</value>
</property>

<property>
<name>mapred.client.failover.connection.retries</name>
<value>0</value>
</property>

<property>
<name>mapred.client.failover.connection.retries.on.timeouts</name>
<value>0</value>
</property>

<property>
<name>mapred.local.dir</name>
<value>{{ hadoop["mapred_local_dir"] | join(',') }}</value>
</property>

<property>
<name>mapred.task.tracker.http.address</name>
<value>0.0.0.0:{{ hadoop['mapred_task_tracker_http_address_port'] }}</value>
</property>

</configuration>
@ -0,0 +1,3 @@
{% for host in groups['hadoop_slaves'] %}
{{ host }}
{% endfor %}
@ -0,0 +1,80 @@
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<configuration>

<property>
<name>ssl.client.truststore.location</name>
<value></value>
<description>Truststore to be used by clients like distcp. Must be
specified.
</description>
</property>

<property>
<name>ssl.client.truststore.password</name>
<value></value>
<description>Optional. Default value is "".
</description>
</property>

<property>
<name>ssl.client.truststore.type</name>
<value>jks</value>
<description>Optional. The keystore file format, default value is "jks".
</description>
</property>

<property>
<name>ssl.client.truststore.reload.interval</name>
<value>10000</value>
<description>Truststore reload check interval, in milliseconds.
Default value is 10000 (10 seconds).
</description>
</property>

<property>
<name>ssl.client.keystore.location</name>
<value></value>
<description>Keystore to be used by clients like distcp. Must be
specified.
</description>
</property>

<property>
<name>ssl.client.keystore.password</name>
<value></value>
<description>Optional. Default value is "".
</description>
</property>

<property>
<name>ssl.client.keystore.keypassword</name>
<value></value>
<description>Optional. Default value is "".
</description>
</property>

<property>
<name>ssl.client.keystore.type</name>
<value>jks</value>
<description>Optional. The keystore file format, default value is "jks".
</description>
</property>

</configuration>
@ -0,0 +1,77 @@
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<configuration>

<property>
<name>ssl.server.truststore.location</name>
<value></value>
<description>Truststore to be used by NN and DN. Must be specified.
</description>
</property>

<property>
<name>ssl.server.truststore.password</name>
<value></value>
<description>Optional. Default value is "".
</description>
</property>

<property>
<name>ssl.server.truststore.type</name>
<value>jks</value>
<description>Optional. The keystore file format, default value is "jks".
</description>
</property>

<property>
<name>ssl.server.truststore.reload.interval</name>
<value>10000</value>
<description>Truststore reload check interval, in milliseconds.
Default value is 10000 (10 seconds).
</description>
</property>

<property>
<name>ssl.server.keystore.location</name>
<value></value>
<description>Keystore to be used by NN and DN. Must be specified.
</description>
</property>

<property>
<name>ssl.server.keystore.password</name>
<value></value>
<description>Must be specified.
</description>
</property>

<property>
<name>ssl.server.keystore.keypassword</name>
<value></value>
<description>Must be specified.
</description>
</property>

<property>
<name>ssl.server.keystore.type</name>
<value>jks</value>
<description>Optional. The keystore file format, default value is "jks".
</description>
</property>

</configuration>
@ -0,0 +1,41 @@
# Firewall configuration written by system-config-firewall
# Manual customization of this file is not recommended.
*filter
:INPUT ACCEPT [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
{% if 'hadoop_masters' in group_names %}
-A INPUT -p tcp --dport {{ hadoop['fs_default_FS_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['dfs_namenode_http_address_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['mapred_job_tracker_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['mapred_job_tracker_http_address_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['mapred_ha_jobtracker_rpc-address_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['mapred_ha_zkfc_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['dfs_ha_zkfc_port'] }} -j ACCEPT
{% endif %}
{% if 'hadoop_slaves' in group_names %}
-A INPUT -p tcp --dport {{ hadoop['dfs_datanode_address_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['dfs_datanode_http_address_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['dfs_datanode_ipc_address_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['mapred_task_tracker_http_address_port'] }} -j ACCEPT
{% endif %}
{% if 'qjournal_servers' in group_names %}
-A INPUT -p tcp --dport {{ hadoop['qjournal_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['qjournal_http_port'] }} -j ACCEPT
{% endif %}
{% if 'zookeeper_servers' in group_names %}
-A INPUT -p tcp --dport {{ hadoop['zookeeper_clientport'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['zookeeper_leader_port'] }} -j ACCEPT
-A INPUT -p tcp --dport {{ hadoop['zookeeper_election_port'] }} -j ACCEPT
{% endif %}
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A INPUT -j REJECT --reject-with icmp-host-prohibited
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
COMMIT
@ -0,0 +1,14 @@
---
# Handlers for the hadoop master services

- name: restart hadoop master services
  service: name={{ item }} state=restarted
  with_items:
    - hadoop-0.20-mapreduce-jobtracker
    - hadoop-hdfs-namenode

- name: restart hadoopha master services
  service: name={{ item }} state=restarted
  with_items:
    - hadoop-0.20-mapreduce-jobtrackerha
    - hadoop-hdfs-namenode
@ -0,0 +1,38 @@
---
# Playbook for Hadoop master servers

- name: Install the namenode and jobtracker packages
  yum: name={{ item }} state=installed
  with_items:
    - hadoop-0.20-mapreduce-jobtrackerha
    - hadoop-hdfs-namenode
    - hadoop-hdfs-zkfc
    - hadoop-0.20-mapreduce-zkfc

- name: Copy the hadoop configuration files
  template: src=roles/common/templates/hadoop_ha_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
  with_items:
    - core-site.xml
    - hadoop-metrics.properties
    - hadoop-metrics2.properties
    - hdfs-site.xml
    - log4j.properties
    - mapred-site.xml
    - slaves
    - ssl-client.xml.example
    - ssl-server.xml.example
  notify: restart hadoopha master services

- name: Create the data directory for the namenode metadata
  file: path={{ item }} owner=hdfs group=hdfs state=directory
  with_items: hadoop.dfs_namenode_name_dir

- name: Create the data directory for the jobtracker ha
  file: path={{ item }} owner=mapred group=mapred state=directory
  with_items: hadoop.mapred_job_tracker_persist_jobstatus_dir

- name: Format the namenode
  shell: creates=/usr/lib/hadoop/namenode.formatted su - hdfs -c "hadoop namenode -format"; touch /usr/lib/hadoop/namenode.formatted

- name: start hadoop namenode services
  service: name=hadoop-hdfs-namenode state=started
@ -0,0 +1,38 @@
---
# Playbook for Hadoop master servers

- name: Install the namenode and jobtracker packages
  yum: name={{ item }} state=installed
  with_items:
    - hadoop-0.20-mapreduce-jobtracker
    - hadoop-hdfs-namenode

- name: Copy the hadoop configuration files for no ha
  template: src=roles/common/templates/hadoop_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
  with_items:
    - core-site.xml
    - hadoop-metrics.properties
    - hadoop-metrics2.properties
    - hdfs-site.xml
    - log4j.properties
    - mapred-site.xml
    - slaves
    - ssl-client.xml.example
    - ssl-server.xml.example
  notify: restart hadoop master services

- name: Create the data directory for the namenode metadata
  file: path={{ item }} owner=hdfs group=hdfs state=directory
  with_items: hadoop.dfs_namenode_name_dir

- name: Format the namenode
  shell: creates=/usr/lib/hadoop/namenode.formatted su - hdfs -c "hadoop namenode -format"; touch /usr/lib/hadoop/namenode.formatted

- name: start hadoop namenode services
  service: name=hadoop-hdfs-namenode state=started

- name: Give permissions for mapred users
  shell: creates=/usr/lib/hadoop/fs.initialized su - hdfs -c "hadoop fs -chown hdfs:hadoop /"; su - hdfs -c "hadoop fs -chmod 0774 /"; touch /usr/lib/hadoop/fs.initialized

- name: start hadoop jobtracker services
  service: name=hadoop-0.20-mapreduce-jobtracker state=started
@ -0,0 +1,9 @@
---
# Playbook for Hadoop master primary servers

- include: hadoop_master.yml
  when: ha_enabled

- include: hadoop_master_no_ha.yml
  when: not ha_enabled
@ -0,0 +1,14 @@
---
# Handlers for the hadoop master services

- name: restart hadoop master services
  service: name={{ item }} state=restarted
  with_items:
    - hadoop-0.20-mapreduce-jobtracker
    - hadoop-hdfs-namenode

- name: restart hadoopha master services
  service: name={{ item }} state=restarted
  with_items:
    - hadoop-0.20-mapreduce-jobtrackerha
    - hadoop-hdfs-namenode
@ -0,0 +1,64 @@
---
# Playbook for Hadoop master secondary server

- name: Install the namenode and jobtracker packages
  yum: name={{ item }} state=installed
  with_items:
    - hadoop-0.20-mapreduce-jobtrackerha
    - hadoop-hdfs-namenode
    - hadoop-hdfs-zkfc
    - hadoop-0.20-mapreduce-zkfc

- name: Copy the hadoop configuration files
  template: src=roles/common/templates/hadoop_ha_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
  with_items:
    - core-site.xml
    - hadoop-metrics.properties
    - hadoop-metrics2.properties
    - hdfs-site.xml
    - log4j.properties
    - mapred-site.xml
    - slaves
    - ssl-client.xml.example
    - ssl-server.xml.example
  notify: restart hadoopha master services

- name: Create the data directory for the namenode metadata
  file: path={{ item }} owner=hdfs group=hdfs state=directory
  with_items: hadoop.dfs_namenode_name_dir

- name: Create the data directory for the jobtracker ha
  file: path={{ item }} owner=mapred group=mapred state=directory
  with_items: hadoop.mapred_job_tracker_persist_jobstatus_dir

- name: Initialize the secondary namenode
  shell: creates=/usr/lib/hadoop/namenode.formatted su - hdfs -c "hadoop namenode -bootstrapStandby"; touch /usr/lib/hadoop/namenode.formatted

- name: start hadoop namenode services
  service: name=hadoop-hdfs-namenode state=started

- name: Initialize the zkfc for namenode
  shell: creates=/usr/lib/hadoop/zkfc.formatted su - hdfs -c "hdfs zkfc -formatZK"; touch /usr/lib/hadoop/zkfc.formatted

- name: start zkfc for namenodes
  service: name=hadoop-hdfs-zkfc state=started
  delegate_to: '{{ item }}'
  with_items: groups.hadoop_masters

- name: Give permissions for mapred users
  shell: creates=/usr/lib/hadoop/fs.initialized su - hdfs -c "hadoop fs -chown hdfs:hadoop /"; su - hdfs -c "hadoop fs -chmod 0774 /"; touch /usr/lib/hadoop/fs.initialized

- name: Initialize the zkfc for jobtracker
  shell: creates=/usr/lib/hadoop/zkfcjob.formatted su - mapred -c "hadoop mrzkfc -formatZK"; touch /usr/lib/hadoop/zkfcjob.formatted

- name: start zkfc for jobtracker
  service: name=hadoop-0.20-mapreduce-zkfc state=started
  delegate_to: '{{ item }}'
  with_items: groups.hadoop_masters

- name: start hadoop jobtracker services
  service: name=hadoop-0.20-mapreduce-jobtrackerha state=started
  delegate_to: '{{ item }}'
  with_items: groups.hadoop_masters
@ -0,0 +1,8 @@
---
# Handlers for the hadoop slave services

- name: restart hadoop slave services
  service: name={{ item }} state=restarted
  with_items:
    - hadoop-0.20-mapreduce-tasktracker
    - hadoop-hdfs-datanode
@ -0,0 +1,4 @@
---
# Playbook for Hadoop slave servers

- include: slaves.yml tags=slaves
@ -0,0 +1,53 @@
---
# Playbook for Hadoop slave servers

- name: Install the datanode and tasktracker packages
  yum: name={{ item }} state=installed
  with_items:
    - hadoop-0.20-mapreduce-tasktracker
    - hadoop-hdfs-datanode

- name: Copy the hadoop configuration files
  template: src=roles/common/templates/hadoop_ha_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
  with_items:
    - core-site.xml
    - hadoop-metrics.properties
    - hadoop-metrics2.properties
    - hdfs-site.xml
    - log4j.properties
    - mapred-site.xml
    - slaves
    - ssl-client.xml.example
    - ssl-server.xml.example
  when: ha_enabled
  notify: restart hadoop slave services

- name: Copy the hadoop configuration files for non ha
  template: src=roles/common/templates/hadoop_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
  with_items:
    - core-site.xml
    - hadoop-metrics.properties
    - hadoop-metrics2.properties
    - hdfs-site.xml
    - log4j.properties
    - mapred-site.xml
    - slaves
    - ssl-client.xml.example
    - ssl-server.xml.example
  when: not ha_enabled
  notify: restart hadoop slave services

- name: Create the data directory for the slave nodes to store the data
  file: path={{ item }} owner=hdfs group=hdfs state=directory
  with_items: hadoop.dfs_datanode_data_dir

- name: Create the data directory for the slave nodes for mapreduce
  file: path={{ item }} owner=mapred group=mapred state=directory
  with_items: hadoop.mapred_local_dir

- name: start hadoop slave services
  service: name={{ item }} state=started
  with_items:
    - hadoop-0.20-mapreduce-tasktracker
    - hadoop-hdfs-datanode
@ -0,0 +1,5 @@
---
# The journal node handlers

- name: restart qjournal services
  service: name=hadoop-hdfs-journalnode state=restarted
@ -0,0 +1,22 @@
---
# Playbook for the qjournal nodes

- name: Install the qjournal package
  yum: name=hadoop-hdfs-journalnode state=installed

- name: Create folder for Journaling
  file: path={{ hadoop.dfs_journalnode_edits_dir }} state=directory owner=hdfs group=hdfs

- name: Copy the hadoop configuration files
  template: src=roles/common/templates/hadoop_ha_conf/{{ item }}.j2 dest=/etc/hadoop/conf/{{ item }}
  with_items:
    - core-site.xml
    - hadoop-metrics.properties
    - hadoop-metrics2.properties
    - hdfs-site.xml
    - log4j.properties
    - mapred-site.xml
    - slaves
    - ssl-client.xml.example
    - ssl-server.xml.example
  notify: restart qjournal services
@ -0,0 +1,5 @@
---
# Handler for the zookeeper services

- name: restart zookeeper
  service: name=zookeeper-server state=restarted
@ -0,0 +1,13 @@
---
# The plays for zookeeper daemons

- name: Install the zookeeper files
  yum: name=zookeeper-server state=installed

- name: Copy the configuration file for zookeeper
  template: src=zoo.cfg.j2 dest=/etc/zookeeper/conf/zoo.cfg
  notify: restart zookeeper

- name: initialize the zookeeper
  shell: creates=/var/lib/zookeeper/myid service zookeeper-server init --myid={{ zoo_id }}
@ -0,0 +1,9 @@
tickTime=2000
dataDir=/var/lib/zookeeper/
clientPort={{ hadoop['zookeeper_clientport'] }}
initLimit=5
syncLimit=2
{% for host in groups['zookeeper_servers'] %}
server.{{ hostvars[host].zoo_id }}={{ host }}:{{ hadoop['zookeeper_leader_port'] }}:{{ hadoop['zookeeper_election_port'] }}
{% endfor %}
@ -0,0 +1,28 @@
---
# The main playbook to deploy the site

- hosts: hadoop_all
  roles:
    - common

- hosts: zookeeper_servers
  roles:
    - { role: zookeeper_servers, when: ha_enabled }

- hosts: qjournal_servers
  roles:
    - { role: qjournal_servers, when: ha_enabled }

- hosts: hadoop_master_primary
  roles:
    - { role: hadoop_primary }

- hosts: hadoop_master_secondary
  roles:
    - { role: hadoop_secondary, when: ha_enabled }

- hosts: hadoop_slaves
  roles:
    - { role: hadoop_slaves }