Hi,
I noticed an issue with rrp_mode=active when one ring is working (ring 0, cross cable) and the other is not (ring 1, the link is up but the two ends are connected to different VLANs, for testing). The transport is udpu (cman + pacemaker + corosync). There are many threads related to this issue, but none of them seems to be fixed in corosync 1.4.1 or corosync 1.4.2, e.g., http://www.gossamer-threads.com/lists/linuxha/pacemaker/77388

Running corosync-cfgtool -s alternates between "no faults" and a problem counter of [1 of 3]:

# corosync-cfgtool -s
Printing ring status.
Local node ID 2
RING ID 0
        id      = 1.0.0.12
        status  = ring 0 active with no faults
RING ID 1
        id      = 1.0.0.2
        status  = ring 1 active with no faults

# corosync-cfgtool -s
Printing ring status.
Local node ID 2
RING ID 0
        id      = 1.0.0.12
        status  = ring 0 active with no faults
RING ID 1
        id      = 1.0.0.2
        status  = Incrementing problem counter for seqid 958 iface 1.0.0.2 to [1 of 3]

The corosync log shows the same flapping:

Jan 23 08:22:32 corosync [TOTEM ] Incrementing problem counter for seqid 898 iface 1.0.0.2 to [1 of 3]
Jan 23 08:22:34 corosync [TOTEM ] ring 1 active with no faults
Jan 23 08:22:36 corosync [TOTEM ] Incrementing problem counter for seqid 900 iface 1.0.0.2 to [1 of 3]
Jan 23 08:22:38 corosync [TOTEM ] ring 1 active with no faults
Jan 23 08:22:41 corosync [TOTEM ] Incrementing problem counter for seqid 902 iface 1.0.0.2 to [1 of 3]
Jan 23 08:22:43 corosync [TOTEM ] ring 1 active with no faults
Jan 23 08:22:45 corosync [TOTEM ] Incrementing problem counter for seqid 904 iface 1.0.0.2 to [1 of 3]
Jan 23 08:22:47 corosync [TOTEM ] ring 1 active with no faults
Jan 23 08:22:49 corosync [TOTEM ] Incrementing problem counter for seqid 906 iface 1.0.0.2 to [1 of 3]
Jan 23 08:22:51 corosync [TOTEM ] ring 1 active with no faults
Jan 23 08:22:54 corosync [TOTEM ] Incrementing problem counter for seqid 908 iface 1.0.0.2 to [1 of 3]
Jan 23 08:22:56 corosync [TOTEM ] ring 1 active with no faults

cluster.conf:

<?xml version="1.0"?>
<cluster config_version="3" name="node">
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3" skip_undefined="1"/>
  <totem rrp_mode="active"/>
  <clusternodes>
    <clusternode name="node-1" nodeid="1" votes="1">
      <altname name="local1sync2"/>
      <fence/>
    </clusternode>
    <clusternode name="node-2" nodeid="2" votes="1">
      <altname name="local2sync2"/>
      <fence/>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" keyfile="/etc/cluster/corosync.authkey" transport="udpu" two_node="1"/>
  <fencedevices/>
  <rm/>
</cluster>

The traffic on both rings looks as expected.

Versions:

corosync-1.4.2-1fs.el6.i686
corosynclib-1.4.2-1fs.el6.i686
pacemaker-1.1.6-3.el6.i686
pacemaker-libs-1.1.6-3.el6.i686
pacemaker-cluster-libs-1.1.6-3.el6.i686
pacemaker-cli-1.1.6-3.el6.i686
cman-3.0.12.1-23.el6.i686
CentOS 6.2
Linux 2.6.32-220.2.1.el6.i686 #1 SMP Thu Dec 22 18:50:52 GMT 2011 i686 i686 i386 GNU/Linux

Again, the issue is that the failed ring is never marked as faulty.

NOTE: Changing rrp_mode to passive seems to work around the issue; in passive mode the bad ring is detected as expected (the exact cluster.conf change is sketched below my signature):

        status  = Marking ringid 1 interface 1.0.0.2 FAULTY

Thanks,
Oren
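
P.S. For completeness, the passive-mode workaround referenced in the NOTE above is, as far as I can tell, just a one-attribute change to the totem line in cluster.conf, with everything else left exactly as posted:

        <!-- workaround sketch: switch RRP from active to passive -->
        <totem rrp_mode="passive"/>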