On 12/02/15 23:18, Alexandre DERUMIER wrote:
What is the behavior of mongo when a shard is unavailable for some reason (crash or network partition)? If shard3 is on the wrong side of a network partition and uses RBD, it will hang. Is that something mongo will handle gracefully?
If one shard is down, I think the cluster is locked.
That's why I thought of adding corosync/pacemaker to restart a mongod daemon on another host and migrate a VIP, keeping the same /dev/rbd3 (as it can be shared on all nodes), for example.
A little bit complex, but this mongodb replication is really buggy under high load. (Need to implement librados inside mongo ;)
I wonder if it might be better to let Mongo do the replication (since
that is what it understands) - so you'd use rbd volumes in pool(s) with
replica size 1 (i.e. no replication) for its storage, and create a Mongo
replica set with n members for each shard. That way a shard down will just be a
degradation alert rather than fatal.
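
A rough sketch of why that would be a degradation rather than an outage,
assuming each shard is a replica set whose members each sit on their own
size-1 rbd volume. The host names, the shard name and both helper functions
are made up for illustration; only the shape of the config document (the one
rs.initiate() takes) and the majority rule for electing a primary come from
Mongo itself.

```python
# Hypothetical sketch: host names, shard names and helpers are illustrative only.

def replica_set_config(shard_name, hosts):
    """Build a replica set config document (the shape rs.initiate() accepts)."""
    return {
        "_id": shard_name,
        "members": [{"_id": i, "host": h} for i, h in enumerate(hosts)],
    }


def shard_state(config, hosts_up):
    """Classify one shard given the set of member hosts that are reachable."""
    members = [m["host"] for m in config["members"]]
    up = sum(1 for h in members if h in hosts_up)
    if up == len(members):
        return "healthy"
    # A primary can only be elected by a strict majority of voting members.
    if up > len(members) // 2:
        return "degraded"
    return "no_primary"


# Assumed layout: three members for shard3, each backed by its own rbd volume.
shard3 = replica_set_config(
    "shard3", ["node1:27018", "node2:27018", "node3:27018"]
)

# One member lost (crash, or its rbd volume hangs behind a partition).
print(shard_state(shard3, {"node1:27018", "node2:27018"}))
```

With one member of three gone, the shard still has a majority and can keep a
primary, so it only needs an alert. Losing two of three leaves no majority,
which is the case that would still need attention.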
Cheers
Mark
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com