patch to support dynamic membership

Hello, Corosync community,

Let me start with a quick introduction to set up the context.

I was looking into ways to use the Corosync/Pacemaker stack to build a high-availability cluster of Redis servers with automatic failover. With Corosync 1.4.1 as the messaging layer and a stateful master/slave Resource Agent under Pacemaker, this was fairly straightforward to set up. Things work well for a static cluster, where membership is defined up front. However, we needed to add new machines to the cluster and remove existing ones without service interruption, and that is where we ran into a problem.

We host our servers on Amazon EC2. EC2 does not allow IP multicast, so the only Corosync transport available to us was UDP unicast. The issue with UDP unicast is that every member must be listed in the configuration file, which is not practical in EC2, where server instances launch and then acquire their IP addresses via DHCP. A server can also get a new address on reboot, or when its DHCP lease expires. So what we really wanted was dynamic membership: servers can join and leave, and the cluster reacts accordingly.

We have scripts and outside logic that monitor the state of the cluster; in other words, we know when servers join and leave. So initially we updated the corosync.conf file and restarted the corosync service on all machines in the cluster. Unfortunately that wasn't good enough: restarting Corosync meant that the Pacemaker services had to be restarted as well, and with them all clustered resources under Pacemaker's control, which meant an interruption of service.
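The join/leave detection mentioned above is not part of the patch, and the scripts themselves are not shown here. As a minimal sketch only, assuming the monitoring logic can produce a list of member IP addresses before and after a change, the difference could be computed like this (the function name is made up for illustration):

```python
def membership_changes(previous, current):
    """Compare two membership snapshots.

    Returns a tuple (joined, left), each a sorted list of addresses.
    Hypothetical helper, not taken from the patch or from our scripts.
    """
    prev, curr = set(previous), set(current)
    joined = sorted(curr - prev)  # present now, absent before
    left = sorted(prev - curr)    # present before, absent now
    return joined, left
```

For example, comparing ["10.211.55.10", "10.211.55.11"] with ["10.211.55.11", "10.211.55.12"] reports 10.211.55.12 as joined and 10.211.55.10 as left.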

Our second approach was to patch Corosync so that membership can be modified at runtime, without a restart. With guidance and helpful advice from Steve Dake, we were able to do that.


And now, to the patch itself.

We use a new object in the object database, called "totem.interface.dynamic", to keep track of dynamic members. You create and delete child objects of that object with the corosync-objctl utility:

To add a new member:
corosync-objctl -c totem.interface.dynamic.10-211-55-12

To delete an existing member:
corosync-objctl -d totem.interface.dynamic.10-211-55-12


The logic in main.c then reacts to those events by adding the member to the member list or removing it from the list. (The object names are IP addresses with the dots replaced by dashes, because the dot is used as the separator in the object hierarchy.)
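For anyone scripting this, the naming convention can be sketched in a few lines of Python. This is only an illustration: the helper names are made up, it assumes IPv4 addresses, and it only builds the corosync-objctl command line from the two invocations shown above rather than running it.

```python
import ipaddress

# Parent object name as given in this post.
OBJECT_PREFIX = "totem.interface.dynamic"


def member_object_name(address):
    """Turn an IPv4 address into the child object name (dots become dashes).

    Hypothetical helper; raises ValueError if the address is not valid IPv4.
    """
    ip = ipaddress.IPv4Address(address)
    return OBJECT_PREFIX + "." + str(ip).replace(".", "-")


def objctl_command(action, address):
    """Build the argument list for adding or deleting a dynamic member."""
    flags = {"add": "-c", "delete": "-d"}  # the two flags used above
    return ["corosync-objctl", flags[action], member_object_name(address)]
```

Here objctl_command("add", "10.211.55.12") gives the same arguments as the "add" example above, and the resulting list can be handed to subprocess.run on a node running the patched Corosync.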

The patch was successfully tested with Corosync versions 1.4.1 and 1.4.2.


Thanks.
Anton.



Attachment: corosync-1.4.2-dynamic_members.patch
Description: Binary data


