MongoDB Reference

The core components in the MongoDB package are: mongod, the core database process; mongos, the controller and query router for sharded clusters; and mongo, the interactive MongoDB shell.

For more information, see http://docs.mongodb.org/manual/reference/program/

help – This will output a list of basic commands available in the MongoDB shell. For help with methods that operate on a database, you will use the db.help() method. Typing help into the MongoDB shell outputs the following:

db.help(): Help on db methods

db.mycoll.help(): Help on collection methods

sh.help(): Sharding helpers

rs.help(): Replica set helpers

help admin: Administrative help

help connect: Connecting to a db help

help keys: Key shortcuts

help misc: Misc things to know

help mr: Mapreduce

show dbs: Show database names

show collections: Show collections in current database

show users: Show users in current database

show profile: Show most recent system.profile entries with time >= 1ms

show logs: Show accessible logger names

show log [name]: Prints out last segment of log in memory; global is default

use <db_name>: Set current database

db.foo.find(): List objects in the foo collection

db.foo.find( { a : 1 } ): List objects in foo where a == 1

it: Result of the last line evaluated; use to further iterate

DBQuery.shellBatchSize = x: Set default number of items to display on shell

exit: Quit Mongo shell

Some of the most important commands for gathering info from a database are the commands that begin with show. For example, show dbs will give you a list of the currently accessible database names on the system, and show collections will list the collections in the current database. To see which database is current, simply type db and the shell will output its name.

List of options shown when you run mongo --help

--shell: run the shell after executing files
--nodb: don’t connect to mongod on startup; no ‘db address’ arg expected
--norc: will not run the “.mongorc.js” file on start up
--quiet: be less chatty
--port arg: port to connect to
--host arg: server to connect to
--eval arg: evaluate javascript
-h [ --help ]: show this usage information
--version: show version information
--verbose: increase verbosity
--ipv6: enable IPv6 support (disabled by default)
--disableJavaScriptJIT: disable the Javascript Just In Time compiler
--disableJavaScriptProtection: allow automatic JavaScript function marshalling
--ssl: use SSL for all connections
--sslCAFile arg: Certificate Authority file for SSL
--sslPEMKeyFile arg: PEM certificate/key file for SSL
--sslPEMKeyPassword arg: password for key in PEM file for SSL
--sslCRLFile arg: Certificate Revocation List file for SSL
--sslAllowInvalidHostnames: allow connections to servers with non-matching hostnames
--sslAllowInvalidCertificates: allow connections to servers with invalid certificates
--sslFIPSMode: activate FIPS 140-2 mode at startup
--retryWrites: automatically retry write operations upon transient network errors
--jsHeapLimitMB arg: set the js scope’s heap size limit

Authentication Options

-u [ --username ] arg: username for authentication
-p [ --password ] arg: password for authentication
--authenticationDatabase arg: user source (defaults to dbname)
--authenticationMechanism arg: authentication mechanism
--gssapiServiceName arg (=mongodb): Service name to use when authenticating using GSSAPI/Kerberos
--gssapiHostName arg: Remote host name to use for purpose of GSSAPI/Kerberos authentication
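
As a rough illustration of how these options combine, here is a small sketch that assembles a command line for the mongo shell. The helper buildMongoArgs, the host name and the user name are made up for this example; only the flags themselves come from the listing above. The password is deliberately left out of the sketch so that no secret ends up in the argument list.

```javascript
// Hypothetical helper: turns an options object into an argument array for
// the mongo shell, using only flags that appear in the help listing.
function buildMongoArgs(opts) {
  const args = [];
  if (opts.host) args.push("--host", opts.host);
  if (opts.port !== undefined) args.push("--port", String(opts.port));
  if (opts.username) args.push("--username", opts.username);
  if (opts.authenticationDatabase) {
    args.push("--authenticationDatabase", opts.authenticationDatabase);
  }
  if (opts.quiet) args.push("--quiet");
  if (opts.eval) args.push("--eval", opts.eval);
  return args;
}

const args = buildMongoArgs({
  host: "db.example.net", // placeholder host
  port: 27017,
  username: "alice", // placeholder user
  authenticationDatabase: "admin",
  quiet: true,
});

// Print the command line that would be run.
console.log(["mongo", ...args].join(" "));
```

Keeping the arguments as an array (rather than one long string) makes it easier to hand them to a process launcher without worrying about shell quoting.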

Difference Between Sharding and Replication on MongoDB

If there is 75 GB of data, then with replication (3 servers) it will store 75 GB on each server, meaning 75 GB on server-1, 75 GB on server-2 and 75 GB on server-3 (correct me if I am wrong), and with sharding it will be stored as 25 GB on server-1, 25 GB on server-2 and 25 GB on server-3. (Right?)

Replica-Set means that you have multiple instances of MongoDB which each mirror all the data of each other. A replica-set consists of one Master (also called “Primary”) and one or more Slaves (aka “Secondary”). Read operations can be served by any slave, so you can increase read performance by adding more slaves to the replica-set (provided that your client application is capable of actually using different set members). But write operations always take place on the master of the replica-set and are then propagated to the slaves, so writes won’t get faster when you add more slaves.

Replica-sets also offer fault-tolerance. When one of the members of the replica-set goes down, the others take over. When the master goes down, the slaves will elect a new master. For that reason it is suggested for production deployments to always use MongoDB as a replica-set of at least three servers, two of them holding data (the third one is a data-less “arbiter” whose vote is needed to elect a new master when one of the data-holding members goes down).
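
The reason for three members rather than two is that an election needs the votes of a strict majority of the voting members. The sketch below only does that arithmetic; the function names majority and faultTolerance are made up for this example and are not part of any MongoDB API.

```javascript
// Votes needed to elect a primary: a strict majority of the voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many voting members may be down while an election can still succeed.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [2, 3, 4, 5]) {
  console.log(
    `${n} members: majority ${majority(n)}, tolerates ${faultTolerance(n)} down`
  );
}
```

With two members the majority is two, so losing either one leaves no way to elect a master; a third voting member (even a data-less arbiter) is what allows one failure to be survived.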

Sharded Cluster means that each shard of the cluster (which can also be a replica-set) takes care of a part of the data. Each request, both reads and writes, is served by the shard where the data resides. This means that both read and write performance can be increased by adding more shards to a cluster. Which document resides on which shard is determined by the shard key of each collection. It should be chosen in a way that the data can be evenly distributed over all shards and so that it is clear for the most common queries which shard holds the matching documents (example: when you frequently query by user_name, your shard key should include the field user_name so each query can be delegated to only the one shard which has that document).
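
To make the user_name example concrete, here is a deliberately simplified sketch of range-based routing. It is a toy model, not how mongos is implemented: the chunk table, the shard names and the functions targetShard and shardsForQuery are all invented for illustration, and it assumes the shard key is the single string field user_name.

```javascript
// Hypothetical chunk table: each entry covers shard-key values in [min, max).
const chunks = [
  { min: "", max: "h", shard: "shard-1" },
  { min: "h", max: "p", shard: "shard-2" },
  { min: "p", max: null, shard: "shard-3" }, // null means no upper bound
];

// A query that includes the shard key can be sent to exactly one shard.
function targetShard(userName) {
  const chunk = chunks.find(
    (c) => userName >= c.min && (c.max === null || userName < c.max)
  );
  return chunk.shard;
}

// A query without the shard key has to be sent to every shard.
function shardsForQuery(query) {
  if (typeof query.user_name === "string") {
    return [targetShard(query.user_name)];
  }
  return [...new Set(chunks.map((c) => c.shard))];
}

console.log(shardsForQuery({ user_name: "alice" })); // one shard
console.log(shardsForQuery({ user_name: "maria" })); // one shard
console.log(shardsForQuery({ age: 30 })); // all shards
```

The last call shows the cost of a badly chosen shard key: a query that does not contain it cannot be targeted and ends up touching every shard.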

The drawback is that the fault-tolerance suffers. When one shard of the cluster goes down, any data on it is inaccessible. For that reason each shard of the cluster should also be a replica-set. This is not required: when you don’t care about high-availability, a shard can also be a single mongod instance without replication. But for production use you should always use replication.

So what does that mean for your example?

When you want to split your data of 75GB into 3 shards of 25GB each, you need at least 6 database servers organized in three replica-sets. Each replica-set consists of two servers which hold the same 25GB of data.
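
The numbers for the three setups discussed here can be checked with a few lines of arithmetic. This is only a sketch: the function layout is made up for this example, and it assumes the data is spread perfectly evenly across the shards.

```javascript
// Storage layout for a data set split over `shards` shards, where each shard
// is a replica-set with `dataMembers` data-holding servers.
function layout(totalGB, shards, dataMembers) {
  // Every data-holding member of a shard stores that shard's full slice.
  const perServerGB = totalGB / shards;
  const servers = shards * dataMembers;
  return { perServerGB, servers, totalStoredGB: perServerGB * servers };
}

console.log(layout(75, 1, 3)); // replication only: 3 servers with 75GB each
console.log(layout(75, 3, 1)); // sharding only: 3 servers with 25GB each
console.log(layout(75, 3, 2)); // 3 shards, each a two-server replica-set
```

The third call matches the setup described above: 6 data-holding servers with 25GB each, so 150GB is stored in total for 75GB of data.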

You also need servers for the arbiters of the three replica-sets as well as the mongos router and the config server for the cluster. The arbiters are very lightweight and are only needed when a replica-set member goes down, so they can usually share the same hardware with something else. But the mongos router and the config server should be redundant and on their own servers.
Source – https://dba.stackexchange.com/questions/52632/difference-between-sharding-and-replication-on-mongodb/53705#53705
