edits to the MongoDB README.md file

11 years ago · 12e49df911
parent 0509255c13
commit 12e49df911
1 changed files with 61 additions and 69 deletions
--- a/mongodb/README.md
+++ b/mongodb/README.md
@@ -1,98 +1,86 @@
-##Deploying a sharded production ready MongoDB cluster with Ansible
+##Deploying a sharded, production-ready MongoDB cluster with Ansible
 ------------------------------------------------------------------------------

 - Requires Ansible 1.2
 - Expects CentOS/RHEL 6 hosts

-###A Primer into the MongoDB NoSQL database.
+### A Primer
 ---------------------------------------------

 ![Alt text](/images/nosql_primer.png "Primer NoSQL")

-The above diagram shows how the MongoDB nosql differs from the traditional
-relational database model. In RDBMS the data of a user is stored in table and
-the  records of users are stored in rows/columns, While in mongodb the 'table'
-is replaced by 'collection' and the individual 'records' are called
-'documents'.  One thing also to be noticed is that the data is stored as
-key/value pairs in BJSON format.
-
-Another thing to be noticed is that nosql has a looser consistency model, as an
-example the second document in the users collection has an additonal field of
-'last name'. Due to this flexibility the nosql database model can give us:
-
-Better Horizontal scaling capability.
-
-Also mongodb has inbuilt support for 
-
-Data Replication & HA
-
-Which makes it good choice for users who have very large data to handle and
-less requirement for ACID.
+The above diagram shows how MongoDB differs from the traditional relational
+database model. In an RDBMS, the data associated with 'user' is stored in a
+table, and the records of users are stored in rows and columns. In MongoDB, the
+'table' is replaced by a 'collection' and the individual 'records' are called
+'documents'.  One thing to notice is that the data is stored as key/value pairs
+in BSON format.

+Another thing to notice is that NoSQL-style databases have a looser, more
+flexible schema. As an example, the second document in the users collection has an
+additional field of 'last name'.
 
-### MongoDB's Data replication .
+### Data Replication
 ------------------------------------

 ![Alt text](/images/replica_set.png "Replica Set")

-
-Data backup is achieved in Mongodb via Replica sets. As the figure above show's
+Data backup is achieved in MongoDB via _replica sets_. As the figure above shows,
 a single replication set consists of a replication master (active) and several
 other replications slaves (passive). All the database operations like
-Add/Delete/Update happens on the replication master and the master replicates
-the data to the slave nodes. mongod is the process which is resposible for all
+add/delete/update happen on the replication master and the master replicates
+the data to the slave nodes. _mongod_ is the process which is responsible for all
 the database activities as well as replication processes. The minimum
 recommended number of slave servers are 3.

-
-### MongoDB's Sharding (Horizontal Scaling) .
+### Sharding (Horizontal Scaling)
 ------------------------------------------------

 ![Alt text](/images/sharding.png "Sharding")

-Sharding allows to achieve a very high performing database, by partioning the
-data into seperate chunks and allocating diffent ranges of chunks to diffrent
-shard servers. The figure above shows a collection which has 90 documents which
-has been sharded across the three shard server, The first shard getting ranges
-from 1- 29 etc... . When a client wants to access a certian document it
-contacts the query router (mongos process), which inturn would contact the
-'configuration node' (lightweight mongod process) which keeps a record of which
-ranges of chunks are distributed across which shards. 
+Sharding works by partitioning the data into separate chunks and allocating
+different ranges of chunks to different shard servers. The figure above shows a
+collection of 90 documents that has been sharded across three shard
+servers, with the first shard getting the range 1-29, and so on. When a client wants
+to access a certain document, it contacts the query router (the mongos process),
+which in turn contacts the 'configuration node' (a lightweight mongod
+process) that keeps a record of which ranges of chunks are distributed across
+which shards.

 Please do note that every shard server should be backed by a replica set, so
 that when data is written/queried copies of the data are available. So in a
-three shard deployment we would require 3 replica sets and primaries of each
+three-shard deployment we would require 3 replica sets and primaries of each
 would act as the sharding server.

-Here's a basic steps of how sharding works. 
+Here are the basic steps of how sharding works:

 1) A new database is created, and collections are added.

-2) New documents get updated as an when clients update, all the new documents
+2) New documents get updated when clients update, and all the new documents
 goes into a single shard.

-3) when the size of collection in a shard exceeds the 'chunk_size' the
+3) When the size of a collection in a shard exceeds the 'chunk_size', the
 collection is split and balanced across shards.


-###Deploy MongoDB cluster via Ansible.
+### Deploying MongoDB with Ansible
 --------------------------------------------

-### Deploy the Cluster.
+#### Deploy the Cluster
 ----------------------------

 ![Alt text](/images/site.png "Site")
  
-The above diagram illustrates the deployment model for mongodb cluster via
-Ansible, This deployment models focuses on deploying a three shard servers,
-each having a replica set, the backup replica servers are other two shard
-primaries. The configuration server are co-located with the shard's. The mongos
-servers are best deployed on seperate servers. These are the minimum recomended
-configuration for a production grade mongodb deployment.  Please note that the
-playbooks are capable of deploying N node cluster not necesarily three. Also
+The diagram above illustrates the deployment model for a MongoDB cluster deployed by
+Ansible. This deployment model focuses on deploying three shard servers,
+each having a replica set whose backup replica servers are the other two shard
+primaries. The configuration servers are co-located with the shards. The _mongos_
+servers are best deployed on separate servers. This is the minimum recommended
+configuration for a production-grade MongoDB deployment. Please note that the
+playbooks are capable of deploying N-node clusters, not limited to three. Also,
 all the processes are secured using keyfiles.

-###Pre-Requisite's
+#### Prerequisites

 Edit the group_vars/all file to reflect the below variables.

@@ -102,9 +90,9 @@ Edit the group_vars/all file to reflect the below variables.
 server.

 3) The default directory for storing data is /data, please do change it if
-requried, also make sure it has sufficient space 10G recommended.
+required. Make sure it has sufficient space: 10G is recommended.

-###Once the pre-requisite's have been done, we can  procced with the site deployment. The following example deploys a three node MongoDB Cluster
+### Deployment Example

 The inventory file looks as follows:

@@ -136,13 +124,13 @@ Build the site with the following command:
 		ansible-playbook -i hosts site.yml


-###Verifying the deployed MongoDB Cluster
+#### Verifying the Deployment 
 ---------------------------------------------

-Once completed we can check replication set availibitly by connecting to
-individual primary replication set nodes, 'mongo --host 192.168.1.1 --port 2700'
-and issue the command to query the status of replication set, we should get a
-similar output.
+Once configuration and deployment have completed, we can check replication set
+availability by connecting to individual primary replication set nodes ('mongo
+--host 192.168.1.1 --port 2700') and issuing the command to query the status of
+the replication set. We should get output similar to the following.

 		
 		web2:PRIMARY> rs.status()
@@ -179,9 +167,9 @@ similar output.
 		}


-we can check the status of the Shards as follows: connect to the mongos service
+We can check the status of the shards as follows: connect to the mongos service
 'mongo localhost:8888/admin -u admin -p 123456' and issue the following command to get
-the status of the Shards.
+the status of the shards:


 		 
@@ -195,21 +183,25 @@ the status of the Shards.
 			{  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }


-###We can also make sure the Sharding works by creating a database,collection and populate it with documents and check if the chunks of the collection are balanced equally across nodes. The below diagram illustrates the verification step.
+We can also make sure that sharding works by creating a database and a
+collection, populating it with documents, and checking whether the chunks of the
+collection are balanced equally across nodes. The diagram below illustrates the
+verification step.
+
 -------------------------------------------------------------------------------------------------------------------------------------------------------------

 ![Alt text](/images/check.png "check")

 The above mentioned steps can be tested with an automated playbook.

-Issue the following command to run the test. In variable passed make sure the
-servername is one of any mongos server.
+Issue the following command to run the test. Pass one of the _mongos_ servers
+in the _servername_ variable.
 		
-		ansible-playbook -i hosts playbooks/testsharding.yml -e servername=mongos
+		ansible-playbook -i hosts playbooks/testsharding.yml -e servername=server1


-Once the playbook completes, we check if the shadring has succeded by logging
-on to any mongos server and issuing the following command. The output display
+Once the playbook completes, we check if the sharding has succeeded by logging
+on to any mongos server and issuing the following command. The output displays
 the number of chunks spread across the shards.

 		mongos> sh.status()
@@ -237,7 +229,7 @@ the number of chunks spread across the shards.

 ![Alt text](/images/scale.png "scale")

-To add a new node to the configured MongoDb Cluster, setup the inventory file as follows:
+To add a new node to the existing MongoDB Cluster, modify the inventory file as follows:

 		#The site wide list of mongodb servers
 		[mongoservers]
@@ -264,7 +256,7 @@ To add a new node to the configured MongoDb Cluster, setup the inventory file as
 		mongos1
 		mongos2

-Make sure you have the new node added in the replicationservers section and
+Make sure you have the new node added in the _replicationservers_ section and
 execute the following command:

 		ansible-playbook -i hosts site.yml
@@ -272,8 +264,8 @@ execute the following command:
 ###Verification.
 -----------------------------

-The verification of the newly added node can be as easy checking the sharding
-status and see the chunks being rebalanced to the newly added node.
+The newly added node can be easily verified by checking the sharding status and
+seeing the chunks being rebalanced to the newly added node.

 			$/usr/bin/mongo localhost:8888/admin -u admin -p 123456
 			mongos> sh.status()