--- On Wed, 5/5/10, Yehuda Sadeh Weinraub <yehudasa@xxxxxxxxx> wrote:

> The problem is that the ceph monitors require a quorum in order to
> decide on the cluster state. The way the system works right now, a
> 2-way monitor setup would be less stable than a system with a single
> monitor, since it wouldn't work whenever either of the two monitors
> crashes.

Right, that is indeed not nice. :)

> A possible workaround would be to have a special case for 2-way mon
> clusters, where a single mon would be enough for a majority. I'm not
> sure whether this is actually feasible. As usual, the devil is in
> the details.

Yes. One simple way is to use a ping node: if a monitor can reach the
ping node but not its peer, it can assume "lone operation" and thus
temporarily degrade to a single-monitor setup. (A rough sketch of this
decision logic is appended after my signature.)

I guess my question is: is this something the ceph project would
potentially be willing to support for OSDs?

I suspect that supporting dynamic reconfiguration:

http://en.wikipedia.org/wiki/Paxos_algorithm#Cheap_Paxos

would also help a great deal in making clusters more adaptable.

> > One suggestion I have would be to use some of the same techniques
> > that heartbeat uses to determine whether a node has gone down or
> > whether there is network segregation: a serial port connection,
> > common ping nodes (such as a router)...
>
> There is a heartbeat mechanism within the mon cluster, and it's
> being used for the monitors to keep track of their peers' status. It
> might be a good idea to add different configurable types of
> heartbeats.

Yes, specifically, I meant using some of the techniques that the
heartbeat project uses:

http://www.linux-ha.org/wiki/Heartbeat

Ideally (my suggestion), they would make some of these available as a
library so that other projects like RADOS could use them independently,
without having to rewrite them from scratch.

> > 2) Is there any way of preventing two users of an RBD device from
> > using the device concurrently? ...
>
> We were just thinking about the proper solution to this problem
> ourselves. There are a few options. One is to add some kind of
> locking mechanism to the osd, which would allow doing just that.
> E.g., a client would take a lock, do whatever it needs to do; a
> second client would try to get the lock but would be able to hold it
> only after the first one has released it. Another option would be to
> have the clients handle the mutual exclusion themselves (hence not
> enforced by the osd) by setting flags and leases on the rbd header.

I'm curious: do you mean a scheme such as regularly writing the name
of the node "locking" the image, along with a timestamp, to the header
as a heartbeat, plus some lock-acquisition logic? (I've appended a
sketch of what I have in mind below as well.)

Thanks for the replies!

-Martin
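
P.S. To make the ping-node idea concrete, here is a rough standalone
sketch of the decision a monitor in a 2-mon cluster could make. All
the names here (evaluate_two_mon_cluster, MonState, the probe
callback) are made up for illustration and are not from the Ceph tree:

  #include <functional>
  #include <iostream>
  #include <string>

  enum class MonState { Quorum, LoneOperation, Stalled };

  // reach() stands for whatever reachability probe the implementation
  // uses (ICMP ping, TCP connect, serial link, ...).
  MonState evaluate_two_mon_cluster(
      const std::function<bool(const std::string&)>& reach,
      const std::string& peer, const std::string& ping_node) {
      if (reach(peer))
          return MonState::Quorum;         // normal 2-mon majority
      if (reach(ping_node))
          return MonState::LoneOperation;  // peer down, network is fine
      return MonState::Stalled;            // we are likely the cut-off one
  }

  int main() {
      // Simulated probe: peer unreachable, router reachable.
      auto probe = [](const std::string& h) { return h == "router"; };
      if (evaluate_two_mon_cluster(probe, "mon.b", "router")
              == MonState::LoneOperation)
          std::cout << "degrade to single-mon operation\n";
  }

The point is that "can reach the ping node but not the peer" is the
only case where it seems safe to keep serving alone; if the ping node
is unreachable too, the monitor should assume it is the one that got
isolated and stall instead.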
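
P.P.S. And here is the kind of header-lease scheme I was asking about
for rbd, again as a self-contained illustrative sketch rather than
real Ceph/RBD code: the holder records (node name, expiry) in the
header and refreshes it as a heartbeat, and another client may only
take the lock once that lease has expired. RbdHeader and HeaderLease
are stand-ins for the real on-disk object, and in practice the
read-modify-write in try_acquire would have to be atomic on the osd:

  #include <chrono>
  #include <iostream>
  #include <optional>
  #include <string>

  using Clock = std::chrono::steady_clock;

  struct HeaderLease {
      std::string holder;
      Clock::time_point expiry;
  };

  struct RbdHeader {
      std::optional<HeaderLease> lease;  // absent == unlocked
  };

  // Take the lock for `node` with a lease of length `ttl`; fails if a
  // live lease is held by someone else.
  bool try_acquire(RbdHeader& hdr, const std::string& node,
                   Clock::duration ttl) {
      auto now = Clock::now();
      if (hdr.lease && hdr.lease->expiry > now &&
          hdr.lease->holder != node)
          return false;
      hdr.lease = HeaderLease{node, now + ttl};
      return true;
  }

  // Heartbeat: the current holder pushes its expiry forward.
  bool renew(RbdHeader& hdr, const std::string& node,
             Clock::duration ttl) {
      if (!hdr.lease || hdr.lease->holder != node)
          return false;
      hdr.lease->expiry = Clock::now() + ttl;
      return true;
  }

  int main() {
      RbdHeader hdr;
      auto ttl = std::chrono::seconds(30);
      std::cout << try_acquire(hdr, "node-a", ttl) << "\n";  // 1: taken
      std::cout << try_acquire(hdr, "node-b", ttl) << "\n";  // 0: held
      std::cout << renew(hdr, "node-a", ttl) << "\n";        // 1: renewed
  }

The awkward part is picking the lease length: too short and a busy
holder gets its lock stolen, too long and failover after a crash is
slow. That's why I was wondering what acquisition logic you had in
mind.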