Hi,

On Wed, Jan 23, 2013 at 12:54 AM, Alan Robertson <alanr@xxxxxxx> wrote:
> Hi,
>
> I have a sudden need to configure corosync for clusters with the
> following characteristics - which I have no control over:
>     unicast
>     3-10 nodes
>     one unbonded 1G interface on one network
>     one unbonded 10G interface on a different network
>     Pacemaker
>     must work smoothly if any single network interface should fail.
>     Failover and failback must be automatic.
>
> From perusing the info from mailing lists via Google, it seems that some
> versions of Corosync might not do this correctly.
>
> What is the oldest version of corosync which is known to support such a
> configuration reliably?

Corosync 1.4.5, but you'd have to build from source, as I haven't seen
any binaries around yet (I may be mistaken on this one). Also,
rrp_mode: passive is the more thoroughly tested mode, so in terms of
reliability that would be the recommended mode (sketch 1 at the end of
this mail shows a minimal configuration along these lines).

With links of different speeds (regardless of active or passive
rrp_mode), Corosync waits for the slower link. Some people have
reported that the slower link occasionally gets marked as faulty and
then auto-recovers, but nobody has reported this behavior on recent
Corosync 1.4.x versions, hence the recommendation for 1.4.5.

On the Pacemaker side, it also depends on whether you have any stateful
resources such as DRBD; stateless resources need less careful
configuration. For interface failures (and not only those) there is
ocf:pacemaker:ping (sketch 2). And if you need services that work on
top of a VIP address, you can add the VIP to a loopback interface
(sketch 3) - this is how I remember it being done in the Cisco world.

"Failover and failback must be automatic" usually means no explicit
resource-stickiness has been defined (sketch 4): Pacemaker fails a
resource over to another available node, and when the original node
returns, the resource fails back to it, based on the returning node's
lower hostname (as determined by strncmp()). With up to 10 nodes,
you'd have to get a little creative with location constraints, node
attributes, or both to make resources fail over and back within a
specific set of nodes (sketch 5).

HTH,
Dan

> --
> Alan Robertson <alanr@xxxxxxx> - @OSSAlanR
>
> "Openness is the foundation and preservative of friendship...  Let me
> claim from you at all times your undisguised opinions." - William
> Wilberforce

--
Dan Frincu
CCNA, RHCE
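
Sketch 1 - a minimal, untested corosync.conf for a 3-node version of
this setup: unicast (udpu) transport, rrp_mode: passive, one ring per
network. All addresses are made up; adjust bindnetaddr/memberaddr (and
add member blocks per node) to match your 1G and 10G networks:

    totem {
        version: 2
        secauth: off
        rrp_mode: passive
        transport: udpu

        # ring 0 on the 1G network
        interface {
            ringnumber: 0
            bindnetaddr: 192.168.1.0
            mcastport: 5405
            member {
                memberaddr: 192.168.1.1
            }
            member {
                memberaddr: 192.168.1.2
            }
            member {
                memberaddr: 192.168.1.3
            }
        }

        # ring 1 on the 10G network
        interface {
            ringnumber: 1
            bindnetaddr: 10.10.1.0
            mcastport: 5407
            member {
                memberaddr: 10.10.1.1
            }
            member {
                memberaddr: 10.10.1.2
            }
            member {
                memberaddr: 10.10.1.3
            }
        }
    }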
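
Sketch 2 - monitoring connectivity with ocf:pacemaker:ping via the crm
shell. The gateway addresses and the p_ping/g_services names are made
up; the location rule keeps g_services off any node that can't reach
either gateway:

    crm configure
      primitive p_ping ocf:pacemaker:ping \
          params host_list="192.168.1.254 10.10.1.254" multiplier="100" \
          op monitor interval="15s" timeout="60s"
      clone cl_ping p_ping
      location l_services_on_connected g_services \
          rule -inf: not_defined pingd or pingd lte 0
      commit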
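
Sketch 3 - the mechanics of putting a VIP on the loopback interface
(address made up). This only shows the address assignment itself;
you'd still have to handle ARP/routing so that only the active node
answers for the VIP, typically by wrapping this in or pairing it with
a cluster resource:

    # on the node that should serve the VIP
    ip addr add 10.10.1.100/32 dev lo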
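
Sketch 4 - resource-stickiness controls the failback half of the
behavior. With the default of 0, resources fail back automatically
when the preferred node returns; a positive value makes them stay
where they are after a failover. Via the crm shell:

    # automatic failback (the default)
    crm configure rsc_defaults resource-stickiness=0

    # or: keep resources in place after a failover
    crm configure rsc_defaults resource-stickiness=100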
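
Sketch 5 - confining resources to a subset of nodes with a node
attribute plus a location constraint. The attribute name (group_a),
node names, and g_services are all made up:

    crm node attribute node1 set group_a true
    crm node attribute node2 set group_a true
    crm node attribute node3 set group_a true
    crm configure location l_services_on_group_a g_services \
        rule 200: group_a eq true

Resources then prefer the nodes carrying the attribute, so failover
and failback happen within that set of nodes.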