> The problem is that, if you enable cman on boot, the fenced node will
> try to join the cluster, fail to reach its peer after post_join_delay
> (default 6 seconds, iirc) and fence its peer. That peer reboots,
> starts cman, tries to connect, fences its peer...

In my case, however, the node which is trying to join is fully operational and has network access. Also, if you look at the configuration in my original email, my post_join_delay is 360 (for testing purposes), so a post_join_delay timeout cannot have occurred.

I might be wrong here, but judging from corosync's log file, the other node even joins the cluster successfully before being marked for fencing by dlm_controld:

Sep 11 11:14:09 corosync [CLM ] CLM CONFIGURATION CHANGE
Sep 11 11:14:09 corosync [CLM ] New Configuration:
Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.1)
Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.2)
Sep 11 11:14:09 corosync [CLM ] Members Left:
Sep 11 11:14:09 corosync [CLM ] Members Joined:
Sep 11 11:14:09 corosync [CLM ] r(0) ip(10.xx.xx.2)
Sep 11 11:14:09 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2
Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2
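
To check the same thing across a longer log, a small script can list the node IDs reported on each [QUORUM] line. This is only a sketch: the function name quorum_members and the regular expression are my own assumptions based on the line format shown above, not anything shipped with corosync or cman.

```python
import re

# Matches corosync lines such as
#   "Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2"
# The pattern is an assumption derived from the log excerpt above.
QUORUM_RE = re.compile(r"corosync \[QUORUM\s*\] Members\[(\d+)\]:((?: \d+)*)")


def quorum_members(log_text):
    """Return one list of node IDs per [QUORUM] membership line found."""
    result = []
    for match in QUORUM_RE.finditer(log_text):
        ids = [int(n) for n in match.group(2).split()]
        result.append(ids)
    return result


sample = """\
Sep 11 11:14:09 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2
Sep 11 11:14:09 corosync [QUORUM] Members[2]: 1 2
"""

# Each inner list is the membership reported by one [QUORUM] line.
print(quorum_members(sample))
```

If both node IDs show up in the membership right before dlm_controld requests fencing, the join itself succeeded and the fencing decision has to come from somewhere else.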
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster