stationxp
[MongoDB] 3. Distributed MongoDB

 

1. Concepts
(1) Shard (sharding)
A shard is usually a replica set.
A shard spans more than one server, and every server in it holds an identical copy of that shard's subset of the data.


(2) Shard key
The shard key is one field or a combination of fields. MongoDB is responsible for balancing load across the shards, and migrates data when the distribution becomes uneven.

Data is partitioned into shards by shard key, each shard covering a contiguous range of key values.

Choosing a shard key: there are dedicated best-practice (and worst-practice) guides; go read the docs yourself.
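To make the range idea concrete, here is a minimal sketch in plain JavaScript (my own illustration, not MongoDB internals) of routing a document to a shard by comparing its shard-key value against chunk boundaries:

```javascript
// Range-based routing sketch: each chunk owns a half-open key
// interval [min, max) and is pinned to one shard.
const chunks = [
  { min: -Infinity, max: 1000, shard: "shard0" },
  { min: 1000, max: 5000, shard: "shard1" },
  { min: 5000, max: Infinity, shard: "shard2" },
];

// Find the chunk whose interval contains the key, return its shard.
function routeToShard(shardKeyValue) {
  const chunk = chunks.find(c => shardKeyValue >= c.min && shardKeyValue < c.max);
  return chunk.shard;
}

console.log(routeToShard(42));     // "shard0"
console.log(routeToShard(1000));   // "shard1"
console.log(routeToShard(999999)); // "shard2"
```

This is why a monotonically increasing key (like an auto-incrementing _id) is usually a poor choice: every new document lands in the last interval, on one shard.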
(3) Chunk
The data in one contiguous shard-key range is called a chunk.
When a chunk grows too large, MongoDB splits it into two smaller chunks.
The threshold can be set at startup with --chunkSize N, where N is in MB.
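The split behavior can be sketched in plain JavaScript too (a toy model: a made-up document-count threshold stands in for --chunkSize's byte limit, and the split point is the median key):

```javascript
// Toy chunk-split model: a chunk that grows past the threshold is
// split at its median key into two smaller chunks.
const MAX_DOCS = 4; // stand-in for the --chunkSize byte limit

function insertKey(chunks, key) {
  const i = chunks.findIndex(c => key >= c.min && key < c.max);
  const c = chunks[i];
  c.keys.push(key);
  if (c.keys.length > MAX_DOCS) {
    c.keys.sort((a, b) => a - b);
    const mid = c.keys[Math.floor(c.keys.length / 2)];
    const left  = { min: c.min, max: mid,   keys: c.keys.filter(k => k < mid) };
    const right = { min: mid,   max: c.max, keys: c.keys.filter(k => k >= mid) };
    chunks.splice(i, 1, left, right); // replace one chunk with two
  }
}

const chunks = [{ min: -Infinity, max: Infinity, keys: [] }];
[10, 20, 30, 40, 50].forEach(k => insertKey(chunks, k));
console.log(chunks.length); // 2: the single chunk split once it held 5 keys
```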
> db.stats()
{
"db" : "admin",
"collections" : 4,
"objects" : 14,
"avgObjSize" : 116.57142857142857,
"dataSize" : 1632,
"storageSize" : 32768,
"numExtents" : 4,
"indexes" : 3,
"indexSize" : 24528,
"fileSize" : 67108864,
"nsSizeMB" : 16,
"dataFileVersion" : {
"major" : 4,
"minor" : 5
},
"extentFreeList" : {
"num" : 0,
"totalSize" : 0
},
"ok" : 1
}
>

Let's try inserting 1,000,000 documents.

for (i = 0; i < 1000000; ++i) {
    db.larged.insert({ "_id": i, "date": new Date(), "title": "哈哈哈", x: i, y: 1000000 - i });
}


>
> db.stats()
{
"db" : "cgdc",
"collections" : 4,
"objects" : 1000008,
"avgObjSize" : 111.99968000255998,
"dataSize" : 112000576,
"storageSize" : 174759936,
"numExtents" : 15,
"indexes" : 2,
"indexSize" : 27929216,
"fileSize" : 469762048,
"nsSizeMB" : 16,
"dataFileVersion" : {
"major" : 4,
"minor" : 5
},
"extentFreeList" : {
"num" : 0,
"totalSize" : 0
},
"ok" : 1
}
>
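A quick sanity check on the stats above: avgObjSize is simply dataSize divided by objects, i.e. bytes per document.

```javascript
// avgObjSize reported by db.stats() is dataSize / objects.
const dataSize = 112000576;
const objects = 1000008;
console.log(dataSize / objects); // 111.99968000255998, matching avgObjSize
```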
(4) mongos
mongos acts as the cluster's front man, dispatching work to the mongod nodes behind it.
mongos does routing and dispatch, and there can be more than one of them.

A mongos process does not persist any data itself.
The cluster's configuration is stored on a dedicated group of mongods called config servers. All config servers must be online for a data migration to complete.

(5) Cluster components
A cluster has three kinds of components: mongos (routing), shards (storage), and config servers (configuration).

mongos is often run directly on the application servers, or alongside the client.


2. Setup

(1) Start three instances
Three mongods on one server, each with its own port, dbpath, and logpath.
Let's write three config files.


drwxr-xr-x 4 root root 4096 May 17 06:24 mongodb1/
-rw-r--r-- 1 gd gd 121 May 17 06:36 mongodb1.config
drwxr-xr-x 2 root root 4096 May 17 06:33 mongodb2/
-rw-r--r-- 1 gd gd 121 May 17 06:36 mongodb2.config
drwxr-xr-x 2 root root 4096 May 17 06:33 mongodb3/
-rw-r--r-- 1 gd gd 121 May 17 06:36 mongodb3.config
Contents of config file 3:
logpath = /data/mongodb3.log
dbpath = /data/mongodb3
port = 27003
nohttpinterface = false
fork = true
auth = false
logappend = true

Note that auth is switched off.


Start the three mongod instances:
gd@ubuntu:/data$ sudo /usr/bin/mongod -f /data/mongodb1.config
about to fork child process, waiting until server is ready for connections.
forked process: 3030
child process started successfully, parent exiting
gd@ubuntu:/data$ sudo /usr/bin/mongod -f /data/mongodb2.config
about to fork child process, waiting until server is ready for connections.
forked process: 3045
child process started successfully, parent exiting
gd@ubuntu:/data$ sudo /usr/bin/mongod -f /data/mongodb3.config
about to fork child process, waiting until server is ready for connections.
forked process: 3060
child process started successfully, parent exiting
gd@ubuntu:/data$


Note the pids:
gd@ubuntu:/data$ ps -e | grep mongod
3030 ? 00:00:01 mongod
3045 ? 00:00:07 mongod
3060 ? 00:00:03 mongod


Check the logs to confirm everything is OK.
What does "recover : no journal files present, no recovery needed" mean?
Instance 2 spent 3.42s preallocating; so did instance 3.


(2) Start mongos
Next up is mongos. First, the help output:
Options:


General options:
-h [ --help ] show this usage information
--version show version information
-f [ --config ] arg configuration file specifying additional options
-v [ --verbose ] [=arg(=v)] be more verbose (include multiple times for more
verbosity e.g. -vvvvv)
--quiet quieter output
--port arg specify port number - 27017 by default
--bind_ip arg comma separated list of ip addresses to listen on
- all local ips by default
--maxConns arg max number of simultaneous connections - 1000000
by default
--logpath arg log file to send write to instead of stdout - has
to be a file, not directory
--syslog log to system's syslog facility instead of file
or stdout
--syslogFacility arg syslog facility used for mongodb syslog message
--logappend append to logpath instead of over-writing
--timeStampFormat arg Desired format for timestamps in log messages.
One of ctime, iso8601-utc or iso8601-local
--pidfilepath arg full path to pidfile (if not set, no pidfile is
created)
--keyFile arg private key for cluster authentication
--setParameter arg Set a configurable parameter
--httpinterface enable http interface
--clusterAuthMode arg Authentication mode used for cluster
authentication. Alternatives are
(keyFile|sendKeyFile|sendX509|x509)
--nounixsocket disable listening on unix sockets
--unixSocketPrefix arg alternative directory for UNIX domain sockets
(defaults to /tmp)
--fork fork server process


Sharding options:
--configdb arg 1 or 3 comma separated config servers
--localThreshold arg ping time (in ms) for a node to be considered local
(default 15ms)
--test just run unit tests
--upgrade upgrade meta data version
--chunkSize arg maximum amount of data per chunk
--ipv6 enable IPv6 support (disabled by default)
--jsonp allow JSONP access via http (has security implications)
--noscripting disable scripting engine


For our case:
sudo /usr/bin/mongos --port 27000 --logpath /data/mongos.log --httpinterface --configdb localhost:27001,localhost:27002,localhost:27003
httpinterface is on and --fork is off, because I want to watch closely; in production both settings should be the opposite.
But with logpath set there is no command-line output and the command does not return, so fine, --fork it is.
gd@ubuntu:~$ sudo /usr/bin/mongos --port 27000 --logpath /data/mongos.log --fork --httpinterface --configdb localhost:27001,localhost:27002,localhost:27003
about to fork child process, waiting until server is ready for connections.
forked process: 3237
child process started successfully, parent exiting
The log:
2014-05-17T07:01:30.260-0700 [mongosMain] MongoS version 2.6.1 starting: pid=3237 port=27000 64-bit host=ubuntu (--help for usage)
2014-05-17T07:01:30.260-0700 [mongosMain] db version v2.6.1
2014-05-17T07:01:30.260-0700 [mongosMain] git version: 4b95b086d2374bdcfcdf2249272fb552c9c726e8
2014-05-17T07:01:30.260-0700 [mongosMain] build info: Linux build14.nj1.10gen.cc 2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64 BOOST_LIB_VERSION=1_49
2014-05-17T07:01:30.260-0700 [mongosMain] allocator: tcmalloc
2014-05-17T07:01:30.260-0700 [mongosMain] options: { net: { http: { enabled: true }, port: 27000 }, processManagement: { fork: true }, sharding: { configDB: "localhost:27001,localhost:27002,localhost:27003" }, systemLog: { destination: "file", path: "/data/mongos.log" } }
2014-05-17T07:01:30.268-0700 [mongosMain] SyncClusterConnection connecting to [localhost:27001]
2014-05-17T07:01:30.268-0700 [mongosMain] SyncClusterConnection connecting to [localhost:27002]
2014-05-17T07:01:30.271-0700 [mongosMain] SyncClusterConnection connecting to [localhost:27003]
2014-05-17T07:01:30.274-0700 [mongosMain] creating WriteBackListener for: localhost:27001 serverID: 000000000000000000000000
2014-05-17T07:01:30.275-0700 [mongosMain] creating WriteBackListener for: localhost:27002 serverID: 000000000000000000000000
2014-05-17T07:01:30.276-0700 [mongosMain] creating WriteBackListener for: localhost:27003 serverID: 000000000000000000000000
2014-05-17T07:01:30.289-0700 [mongosMain] scoped connection to localhost:27001,localhost:27002,localhost:27003 not being returned to the pool
2014-05-17T07:01:30.296-0700 [mongosMain] waiting for connections on port 27000
2014-05-17T07:01:30.296-0700 [Balancer] about to contact config servers and shards
2014-05-17T07:01:30.296-0700 [Balancer] SyncClusterConnection connecting to [localhost:27001]
2014-05-17T07:01:30.297-0700 [websvr] admin web console waiting for connections on port 28000
2014-05-17T07:01:30.297-0700 [Balancer] SyncClusterConnection connecting to [localhost:27002]
2014-05-17T07:01:30.298-0700 [Balancer] SyncClusterConnection connecting to [localhost:27003]
2014-05-17T07:01:30.298-0700 [Balancer] config servers and shards contacted successfully
2014-05-17T07:01:30.299-0700 [Balancer] balancer id: ubuntu:27000 started at May 17 07:01:30
2014-05-17T07:01:30.300-0700 [Balancer] SyncClusterConnection connecting to [localhost:27001]
2014-05-17T07:01:30.300-0700 [Balancer] SyncClusterConnection connecting to [localhost:27002]
2014-05-17T07:01:30.301-0700 [Balancer] SyncClusterConnection connecting to [localhost:27003]
2014-05-17T07:01:30.310-0700 [Balancer] SyncClusterConnection connecting to [localhost:27001]
2014-05-17T07:01:30.310-0700 [Balancer] SyncClusterConnection connecting to [localhost:27002]
2014-05-17T07:01:30.310-0700 [LockPinger] creating distributed lock ping thread for localhost:27001,localhost:27002,localhost:27003 and process ubuntu:27000:1400335290:1804289383 (sleeping for 30000ms)
2014-05-17T07:01:30.311-0700 [Balancer] SyncClusterConnection connecting to [localhost:27003]
2014-05-17T07:01:30.625-0700 [LockPinger] cluster localhost:27001,localhost:27002,localhost:27003 pinged successfully at Sat May 17 07:01:30 2014 by distributed lock pinger 'localhost:27001,localhost:27002,localhost:27003/ubuntu:27000:1400335290:1804289383', sleeping for 30000ms
2014-05-17T07:01:30.662-0700 [Balancer] distributed lock 'balancer/ubuntu:27000:1400335290:1804289383' acquired, ts : 53776bbae347c47323c028fc
2014-05-17T07:01:30.823-0700 [Balancer] distributed lock 'balancer/ubuntu:27000:1400335290:1804289383' unlocked.
2014-05-17T07:01:37.135-0700 [Balancer] distributed lock 'balancer/ubuntu:27000:1400335290:1804289383' acquired, ts : 53776bc0e347c47323c028fd
2014-05-17T07:01:37.279-0700 [Balancer] distributed lock 'balancer/ubuntu:27000:1400335290:1804289383' unlocked.
2014-05-17T07:01:43.709-0700 [Balancer] distributed lock 'balancer/ubuntu:27000:1400335290:1804289383' acquired, ts : 53776bc7e347c47323c028fe


The cluster can also be monitored at http://192.168.57.129:28000/.
The log is visible there, but every connection link returns 403 when clicked. A bug, or something I did wrong?


(3) Was the data replicated?
Let's look at the data on the three mongods.
gd@ubuntu:/data$ sudo /usr/bin/mongo localhost:27001
> db.stats()
{
"db" : "cgdc",
"collections" : 4,
"objects" : 1000008,
"avgObjSize" : 111.99968000255998,
"dataSize" : 112000576,
"storageSize" : 174759936,
"numExtents" : 15,
"indexes" : 2,
"indexSize" : 27929216,
"fileSize" : 469762048,
"nsSizeMB" : 16,
"dataFileVersion" : {
"major" : 4,
"minor" : 5
},
"extentFreeList" : {
"num" : 0,
"totalSize" : 0
},
"ok" : 1
}
>


gd@ubuntu:/data$ sudo /usr/bin/mongo localhost:27002
MongoDB shell version: 2.6.1
connecting to: localhost:27002/test
> show dbs
admin (empty)
config 0.078GB
local 0.078GB
>
No data was replicated between them.


Now look through mongos:
gd@ubuntu:/data$ sudo /usr/bin/mongo localhost:27000
MongoDB shell version: 2.6.1
connecting to: localhost:27000/test
mongos> show dbs
admin 0.063GB
config 0.063GB
mongos> use admin
switched to db admin
mongos> show collections
system.indexes
system.users
system.version
mongos> db.system.users.find()
{ "_id" : "admin.admin", "user" : "admin", "db" : "admin", "credentials" : { "MONGODB-CR" : "d90749c715135e08f60ef9a6fb264f5a" }, "roles" : [ { "role" : "userAdminAnyDatabase", "db" : "admin" } ] }
{ "_id" : "admin.root", "user" : "root", "db" : "admin", "credentials" : { "MONGODB-CR" : "a184e338b317a41aa9a4e0321488692a" }, "roles" : [ { "role" : "root", "db" : "admin" } ] }
{ "_id" : "cgdc.cgdc", "user" : "cgdc", "db" : "cgdc", "credentials" : { "MONGODB-CR" : "714ab4fd93e3012475de54246a444516" }, "roles" : [ { "role" : "dbOwner", "db" : "cgdc" } ] }
{ "_id" : "test.test", "user" : "test", "db" : "test", "credentials" : { "MONGODB-CR" : "a6de521abefc2fed4f5876855a3484f5" }, "roles" : [ { "role" : "dbOwner", "db" : "test" } ] }
mongos>


Were the users replicated, or was the query routed to instance 1?
gd@ubuntu:/data$ sudo /usr/bin/mongo localhost:27002
MongoDB shell version: 2.6.1
connecting to: localhost:27002/test
> use admin
switched to db admin
> db.system.users.find();
>


Shut down 1 and 2, leaving 3 running.
Reconnecting to mongos now fails with an error saying instance 1 cannot be found.
(4) Replica set
Connect to the admin database through mongos:
db.runCommand({"addShard":"CgdcSet/localhost:27001,localhost:27002,localhost:27003","name":"CgdcShard"});
This fails: "couldn't connect to new shard socket exception".
No idea how this is meant to be used, so let's take another route and build the replica set first.


a. Shut all the servers down and restart them with the --replSet option.
b. Run the initialization.
It errors out: "member has data already, cannot initiate set".
Wipe instances 2 and 3 and try again; this time it works:
"Config now saved locally. Should come online in about a minute."
The prompt also changes to:
CgdcReplSet:PRIMARY>
Take a look at the other two nodes.
The data has been replicated there, and the prompt reads:
CgdcReplSet:SECONDARY>
Querying data there fails with "not master and slaveOk=false".
Adjusting the read settings on the secondary should sort that out.
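For reference, the initialization step and the secondary-read fix look roughly like this in a 2.6-era mongo shell (set name and hosts taken from above; this is a sketch, not a transcript of the actual session):

```javascript
// On one member, initiate the set:
rs.initiate({
  _id: "CgdcReplSet",
  members: [
    { _id: 0, host: "localhost:27001" },
    { _id: 1, host: "localhost:27002" },
    { _id: 2, host: "localhost:27003" }
  ]
});

// On a SECONDARY, allow reads (the fix for "not master and slaveOk=false"):
rs.slaveOk();
db.larged.findOne();
```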


Nodes in a replica set come in three types: standard, passive, and arbiter.


Now try shutting down the master node.
Node 2 successfully becomes master.
Then bring node 1 back in.


The replica set is working.




(5) Back to mongos. Now running
sudo /usr/bin/mongos --port 27000 --logpath /data/mongos.log --httpinterface --configdb localhost:27001,localhost:27002,localhost:27003
fails with "not master".


sudo /usr/bin/mongos --port 27000 --logpath /data/mongos.log --httpinterface --configdb localhost:27001
fails with the same error.


Pointing it only at 2 succeeds, but that defeats the purpose. (The likely cause: in 2.6, config servers have to be plain standalone mongods, not replica-set members.) mongos exists to connect to several nodes at once and route between shards; with a single box, what is there to route?


(6) Adding the shard again
The cgdc database was added to CgdcShard a moment ago; adding it again fails.
Drop the database, add it again: OK.
So: create the shard first, then create the data, and so on.
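The working order, sketched as 2.6-era shell commands against mongos (shard, set, and collection names taken from above; a sketch, not the exact session):

```javascript
// In the admin database on mongos:
db.runCommand({ addShard: "CgdcReplSet/localhost:27001", name: "CgdcShard" });
sh.enableSharding("cgdc");                      // enable sharding for the database
sh.shardCollection("cgdc.larged", { _id: 1 });  // pick a shard key, shard the collection
// Only then insert the data, so chunks can be split and balanced.
```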


With that, the picture is complete.


Now I know how to use mongos.
Now I know how to use shards.
Now I know how to use replSet.


Here is how I see it:
At the start a single shard is enough; that shard maps to one replica set of three servers.
When disk or memory runs short and it's time to scale out, buy three more machines, configure a second replica set, and create Shard2.
With two shards, mongos finally has a job: routing by shard key.


When another shard is added, does mongos have to be restarted?
Something to check in the docs.


The configuration side is basically sorted out; things like auth can wait for a second pass through the documentation.




Two big topics remain, "3. Usage" and "4. Administration"; more on those tomorrow.










