- Add missing EPEL GPG key

- Word-wrap and update README.md
- iface automatically gets a default
- move mongod_ports array to a per-host inventory variable instead
- use copy instead of template for non-templated files
- create the mongod system user as needed
pull/63/head
Tim Gerla 11 years ago
parent afe0195875
commit e606dba46f
  1. mongodb/README.md (104)
  2. mongodb/group_vars/all (29)
  3. mongodb/hosts (9)
  4. mongodb/roles/common/files/10gen.repo.j2 (0)
  5. mongodb/roles/common/files/RPM-GPG-KEY-EPEL-6 (29)
  6. mongodb/roles/common/files/epel.repo.j2 (0)
  7. mongodb/roles/common/tasks/main.yml (15)
  8. mongodb/roles/common/templates/iptables.j2 (2)
  9. mongodb/roles/mongod/tasks/main.yml (6)
  10. mongodb/roles/mongod/templates/mongod.conf.j2 (2)
  11. mongodb/roles/mongod/templates/repset_init.j2 (2)
  12. mongodb/roles/mongod/templates/shard_init.j2 (2)

@@ -1,15 +1,24 @@
##Deploying a sharded, production-ready MongoDB cluster with Ansible
------------------------------------------------------------------------------
- Requires Ansible 1.2
- Expects CentOS/RHEL 6 hosts
###A Primer into the MongoDB NoSQL database.
---------------------------------------------
![Alt text](/images/nosql_primer.png "Primer NoSQL")
The above diagram shows how the MongoDB NoSQL model differs from the
traditional relational database model. In an RDBMS the data of a user is
stored in a table and the records of users are stored in rows/columns, while
in MongoDB the 'table' is replaced by a 'collection' and the individual
'records' are called 'documents'. One thing also to be noticed is that the
data is stored as key/value pairs in BSON format.
Another thing to be noticed is that NoSQL has a looser consistency model; as
an example, the second document in the users collection has an additional
field of 'last name'. Due to this flexibility the NoSQL database model can
give us:
Better Horizontal scaling capability.
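The flexibility just described can be pictured with two hypothetical documents in the users collection (field names are illustrative, not taken from the playbooks):

```json
[
  { "first_name": "joe", "age": 30 },
  { "first_name": "jill", "age": 27, "last_name": "smith" }
]
```

The second document carries an extra 'last name' field, which a fixed relational schema could not absorb without altering the table.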
@@ -17,7 +26,8 @@ Also mongodb has inbuilt support for
Data Replication & HA
Which makes it a good choice for users who have very large data to handle and
less requirement for ACID.
### MongoDB's Data Replication
@@ -26,7 +36,13 @@ Which makes it good choice for users who have very large data to handle and less
![Alt text](/images/replica_set.png "Replica Set")
Data backup is achieved in MongoDB via replica sets. As the figure above
shows, a single replica set consists of a replication master (active) and
several other replication slaves (passive). All the database operations like
Add/Delete/Update happen on the replication master, and the master replicates
the data to the slave nodes. mongod is the process which is responsible for
all the database activities as well as the replication processes. The minimum
recommended number of slave servers is 3.
### MongoDB's Sharding (Horizontal Scaling)
@@ -34,17 +50,29 @@ Data backup is achieved in Mongodb via Replica sets. As the figure above show's
![Alt text](/images/sharding.png "Sharding")
Sharding allows us to achieve a very high-performing database by partitioning
the data into separate chunks and allocating different ranges of chunks to
different shard servers. The figure above shows a collection of 90 documents
which has been sharded across the three shard servers, the first shard
getting the range 1-29, and so on. When a client wants to access a certain
document it contacts the query router (the mongos process), which in turn
contacts the 'configuration node' (a lightweight mongod process), which keeps
a record of which ranges of chunks are distributed across which shards.
Please do note that every shard server should be backed by a replica set, so
that copies of the data are available when data is written/queried. So in a
three-shard deployment we would require 3 replica sets, and the primary of
each would act as the shard server.
Here are the basic steps of how sharding works:
1) A new database is created, and collections are added.
2) New documents get added as and when clients update; all the new documents
go into a single shard.
3) When the size of a collection in a shard exceeds the 'chunk_size', the
collection is split and balanced across the shards.
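As a sketch, these steps can also be driven by hand from any mongos. The database name, collection, and shard key below are hypothetical; the port and admin credentials are the ones used elsewhere in this README:

```
$ mongo localhost:8888/admin -u admin -p 123456
mongos> sh.enableSharding("testdb")
mongos> sh.shardCollection("testdb.users", { "user_id": 1 })
mongos> sh.status()
```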
###Deploy MongoDB cluster via Ansible.
@@ -55,20 +83,26 @@ Here's a basic steps of how sharding works.
![Alt text](/images/site.png "Site")
The above diagram illustrates the deployment model for a MongoDB cluster via
Ansible. This deployment model focuses on deploying three shard servers, each
having a replica set; the backup replica servers are the other two shard
primaries. The configuration servers are co-located with the shards. The
mongos servers are best deployed on separate servers. This is the minimum
recommended configuration for a production-grade MongoDB deployment. Please
note that the playbooks are capable of deploying an N-node cluster, not
necessarily three. Also, all the processes are secured using keyfiles.
###Pre-Requisites
Edit the group_vars/all file to reflect the below variables.
1) iface: 'eth1' # the interface to be used for all communication.
2) Set a unique mongod_port variable in the inventory file for each MongoDB
server.
3) The default directory for storing data is /data; please do change it if
required, and also make sure it has sufficient space (10G recommended).
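For example, a minimal group_vars/all override for items 1) and 3) might look like this (the values here are illustrative):

```
# group_vars/all -- illustrative overrides
iface: eth1                       # interface used for all cluster communication
mongodb_datadir_prefix: /data/    # make sure this has ~10G of free space
```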
###Once the pre-requisites have been done, we can proceed with the site deployment. The following example deploys a three-node MongoDB cluster
@@ -76,9 +110,9 @@ The inventory file looks as follows:
#The site wide list of mongodb servers
[mongo_servers]
mongo1 mongod_port=2700
mongo2 mongod_port=2701
mongo3 mongod_port=2702
#The list of servers where replication should happen, including the master server.
[replication_servers]
@@ -105,8 +139,10 @@ Build the site with the following command:
###Verifying the deployed MongoDB Cluster
---------------------------------------------
Once completed, we can check replica set availability by connecting to the
individual primary replica set nodes, 'mongo --host 192.168.1.1 --port 2700',
and issuing the command to query the status of the replica set; we should get
a similar output.
web2:PRIMARY> rs.status()
@@ -143,8 +179,9 @@ and issue the command to query the status of replication set, we should get a si
}
We can check the status of the shards as follows: connect to the mongos
service with 'mongo localhost:8888/admin -u admin -p 123456' and issue the
following command to get the status of the shards.
@@ -165,12 +202,15 @@ and issue the following command to get the status of the Shards.
The above mentioned steps can be tested with an automated playbook.
Issue the following command to run the test. In the variable passed, make
sure the servername is any one of the mongos servers.
ansible-playbook -i hosts playbooks/testsharding.yml -e servername=mongos
Once the playbook completes, we can check whether the sharding has succeeded
by logging on to any mongos server and issuing the following command. The
output displays the number of chunks spread across the shards.
mongos> sh.status()
--- Sharding Status ---
@@ -201,10 +241,10 @@ To add a new node to the configured MongoDb Cluster, setup the inventory file as
#The site wide list of mongodb servers
[mongoservers]
mongo1 mongod_port=2700
mongo2 mongod_port=2701
mongo3 mongod_port=2702
mongo4 mongod_port=2703
#The list of servers where replication should happen, make sure the new node is listed here.
[replicationservers]
@@ -224,14 +264,16 @@ To add a new node to the configured MongoDb Cluster, setup the inventory file as
mongos1
mongos2
Make sure you have the new node added in the replicationservers section and
execute the following command:
ansible-playbook -i hosts site.yml
###Verification.
-----------------------------
Verification of the newly added node can be as easy as checking the sharding
status and seeing the chunks being rebalanced to the newly added node.
$/usr/bin/mongo localhost:8888/admin -u admin -p 123456
mongos> sh.status()

@@ -1,26 +1,25 @@
# The global variable file mongodb installation

# The chunksize for shards in MB
mongos_chunk_size: 1

# The port in which mongos server should listen on
mongos_port: 8888

# The port for mongo config server
mongoc_port: 7777

# The directory prefix where the database files would be stored
mongodb_datadir_prefix: /data/

# The interface where the mongodb process should listen on.
# Defaults to the first interface. Change this to:
#
# iface: eth1
#
# ...to override.
#
iface: '{{ ansible_default_ipv4.interface }}'

# The password for admin user
mongo_admin_pass: 123456

@@ -1,10 +1,11 @@
#The site wide list of mongodb servers
# the mongo servers need a mongod_port variable set, and they must not conflict.
[mongo_servers]
hadoop1 mongod_port=2700
hadoop2 mongod_port=2701
hadoop3 mongod_port=2702
hadoop4 mongod_port=2703
#The list of servers where replication should happen, by default include all servers
[replication_servers]

@@ -0,0 +1,29 @@
-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v1.4.5 (GNU/Linux)
mQINBEvSKUIBEADLGnUj24ZVKW7liFN/JA5CgtzlNnKs7sBg7fVbNWryiE3URbn1
JXvrdwHtkKyY96/ifZ1Ld3lE2gOF61bGZ2CWwJNee76Sp9Z+isP8RQXbG5jwj/4B
M9HK7phktqFVJ8VbY2jfTjcfxRvGM8YBwXF8hx0CDZURAjvf1xRSQJ7iAo58qcHn
XtxOAvQmAbR9z6Q/h/D+Y/PhoIJp1OV4VNHCbCs9M7HUVBpgC53PDcTUQuwcgeY6
pQgo9eT1eLNSZVrJ5Bctivl1UcD6P6CIGkkeT2gNhqindRPngUXGXW7Qzoefe+fV
QqJSm7Tq2q9oqVZ46J964waCRItRySpuW5dxZO34WM6wsw2BP2MlACbH4l3luqtp
Xo3Bvfnk+HAFH3HcMuwdaulxv7zYKXCfNoSfgrpEfo2Ex4Im/I3WdtwME/Gbnwdq
3VJzgAxLVFhczDHwNkjmIdPAlNJ9/ixRjip4dgZtW8VcBCrNoL+LhDrIfjvnLdRu
vBHy9P3sCF7FZycaHlMWP6RiLtHnEMGcbZ8QpQHi2dReU1wyr9QgguGU+jqSXYar
1yEcsdRGasppNIZ8+Qawbm/a4doT10TEtPArhSoHlwbvqTDYjtfV92lC/2iwgO6g
YgG9XrO4V8dV39Ffm7oLFfvTbg5mv4Q/E6AWo/gkjmtxkculbyAvjFtYAQARAQAB
tCFFUEVMICg2KSA8ZXBlbEBmZWRvcmFwcm9qZWN0Lm9yZz6JAjYEEwECACAFAkvS
KUICGw8GCwkIBwMCBBUCCAMEFgIDAQIeAQIXgAAKCRA7Sd8qBgi4lR/GD/wLGPv9
qO39eyb9NlrwfKdUEo1tHxKdrhNz+XYrO4yVDTBZRPSuvL2yaoeSIhQOKhNPfEgT
9mdsbsgcfmoHxmGVcn+lbheWsSvcgrXuz0gLt8TGGKGGROAoLXpuUsb1HNtKEOwP
Q4z1uQ2nOz5hLRyDOV0I2LwYV8BjGIjBKUMFEUxFTsL7XOZkrAg/WbTH2PW3hrfS
WtcRA7EYonI3B80d39ffws7SmyKbS5PmZjqOPuTvV2F0tMhKIhncBwoojWZPExft
HpKhzKVh8fdDO/3P1y1Fk3Cin8UbCO9MWMFNR27fVzCANlEPljsHA+3Ez4F7uboF
p0OOEov4Yyi4BEbgqZnthTG4ub9nyiupIZ3ckPHr3nVcDUGcL6lQD/nkmNVIeLYP
x1uHPOSlWfuojAYgzRH6LL7Idg4FHHBA0to7FW8dQXFIOyNiJFAOT2j8P5+tVdq8
wB0PDSH8yRpn4HdJ9RYquau4OkjluxOWf0uRaS//SUcCZh+1/KBEOmcvBHYRZA5J
l/nakCgxGb2paQOzqqpOcHKvlyLuzO5uybMXaipLExTGJXBlXrbbASfXa/yGYSAG
iVrGz9CE6676dMlm8F+s3XXE13QZrXmjloc6jwOljnfAkjTGXjiB7OULESed96MR
XtfLk0W5Ab9pd7tKDR6QHI7rgHXfCopRnZ2VVQ==
=V/6I
-----END PGP PUBLIC KEY BLOCK-----

@@ -4,11 +4,17 @@
- name: Create the hosts file for all machines
  template: src=hosts.j2 dest=/etc/hosts

- name: Create the repository for 10Gen
  copy: src=10gen.repo.j2 dest=/etc/yum.repos.d/10gen.repo

- name: Create the EPEL Repository.
  copy: src=epel.repo.j2 dest=/etc/yum.repos.d/epel.repo

- name: Create the GPG key for EPEL
  copy: src=RPM-GPG-KEY-EPEL-6 dest=/etc/pki/rpm-gpg

- name: Create the mongod user
  user: name=mongod comment="MongoD"

- name: Create the data directory for the namenode metadata
  file: path={{ mongodb_datadir_prefix }} owner=mongod group=mongod state=directory
@@ -27,4 +33,3 @@
- name: Create the iptables file
  template: src=iptables.j2 dest=/etc/sysconfig/iptables
  notify: restart iptables

@@ -12,7 +12,7 @@
{% endif %}
{% if 'mongo_servers' in group_names %}
{% for host in groups['mongo_servers'] %}
-A INPUT -p tcp --dport {{ hostvars[host]['mongod_port'] }} -j ACCEPT
{% endfor %}
{% endif %}
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
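With the per-host mongod_port inventory variables, the loop above renders one ACCEPT rule per host in [mongo_servers]; for the three-node example inventory it would produce roughly:

```
-A INPUT -p tcp --dport 2700 -j ACCEPT
-A INPUT -p tcp --dport 2701 -j ACCEPT
-A INPUT -p tcp --dport 2702 -j ACCEPT
```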

@@ -1,5 +1,5 @@
---
# This role deploys the mongod processes and sets up the replication set.
- name: create data directory for mongodb
file: path={{ mongodb_datadir_prefix }}/mongo-{{ inventory_hostname }} state=directory owner=mongod group=mongod
@@ -34,6 +34,4 @@
pause: seconds=20
- name: Initialize the replication set
shell: /usr/bin/mongo --port "{{ mongod_port }}" /tmp/repset_init.js

@@ -9,7 +9,7 @@ logappend=true
# fork and run in background
fork = true
port = {{ mongod_port }}
dbpath={{ mongodb_datadir_prefix }}mongo-{{ inventory_hostname }}
keyFile={{ mongodb_datadir_prefix }}/secret

@@ -1,7 +1,7 @@
rs.initiate()
sleep(13000)
{% for host in groups['replication_servers'] %}
rs.add("{{ host }}:{{ mongod_port }}")
sleep(8000)
{% endfor %}
printjson(rs.status())

@@ -1,2 +1,2 @@
sh.addShard("{{ inventory_hostname }}/{{ inventory_hostname }}:{{ mongod_port }}")
printjson(rs.status())