Hi,

I'm having "exactly" the same problem with some clusters (version:
cman-2.0.84-2.el5_2.2, ..).

Is it the case that if you reboot the node that was killed, it rejoins the
cluster without being killed? And does it only happen if you start the whole
cluster from scratch?

I haven't figured out the whole picture behind it, but I think it is related
to IGMP, openais and cman. At least it feels like the same behaviour I'm
experiencing. Somehow it seems to be related to the network switches and the
IGMP version being used (I don't see it on all RHEL5 clusters, but on the
majority of those running RHEL5U2+).

I'm still investigating this issue. Strange thing.

Marc.

On Friday 09 January 2009 11:47:02 Alain.Moulle wrote:
> Hi
>
> Release : cman-2.0.95-1.el5
> (but same problem with 2.0.98)
>
> I face a problem when launching cman on a two-node cluster :
>
> 1. Launching cman on node 1 : OK
> 2. When launching cman on node 2, the log on node1 gives :
>    cman killed by node 2 because we rejoined the cluster without a full
>    restart
>
> Any idea ? knowing that my cluster.conf is likewise (note the use of gfs
> if it could be linked to ...) :
>
> <?xml version="1.0"?>
> <cluster config_version="4" name="HA_TEST">
>   <fence_daemon clean_start="1" post_fail_delay="0" post_join_delay="60"/>
>   <clusternodes>
>     <clusternode name="node1" nodeid="1" votes="1">
>       <fence>
>         <method name="1">
>           <device name="node1fence" option="reboot"/>
>         </method>
>       </fence>
>     </clusternode>
>     <clusternode name="node2" nodeid="2" votes="1">
>       <fence>
>         <method name="1">
>           <device name="node2fence" option="reboot"/>
>         </method>
>       </fence>
>     </clusternode>
>   </clusternodes>
>   <cman cluster_id="0" expected_votes="1" two_node="1"/>
>   <fencedevices>
>     <fencedevice agent="fence_ipmilan" ipaddr="12.1.1.80"
>                  login="administrator" name="node1fence" passwd="administrator"/>
>     <fencedevice agent="fence_ipmilan" ipaddr="12.1.1.81"
>                  login="administrator" name="node2fence" passwd="administrator"/>
>   </fencedevices>
>   <rm>
>     <failoverdomains>
>       <failoverdomain name="MgmtNodes" ordered="0" restricted="0">
>         <failoverdomainnode name="node1" priority="1"/>
>         <failoverdomainnode name="node2" priority="2"/>
>       </failoverdomain>
>     </failoverdomains>
>     <service domain="MgmtNodes" name="HA_MGMT" autostart="0"
>              recovery="relocate">
>       <ip address="10.0.0.65/8" monitor_link="1"/>
>       <ip address="172.16.118.118/24" monitor_link="1"/>
>       <clusterfs device="LABEL=HA_MGMT:ganglia"
>                  mountpoint="/var/lib/ganglia/rrds" force_unmount="0"
>                  fstype="gfs2" name="nfsha2" options=""/>
>       <script file="/usr/sbin/haservices" name="haservices"/>
>     </service>
>   </rm>
>   <logging syslog_facility="daemon"/>
>   <totem token="21000"/>
> </cluster>
>
> Thanks
> Regards
> Alain Moullé

--
Gruss / Regards,

Marc Grimme
Phone: +49-89 452 3538-14
http://www.atix.de/ http://www.open-sharedroot.org/

ATIX Informationstechnologie und Consulting AG | Einsteinstrasse 10 |
85716 Unterschleissheim | www.atix.de | www.open-sharedroot.org

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
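For anyone comparing their own setup against the cluster.conf quoted in this thread, here is a minimal sketch in Python that runs a few basic consistency checks: that two_node="1" is paired with expected_votes="1" and exactly two nodes, that node IDs are unique, and that every fence device referenced by a node is actually defined. The function name check_cluster_conf and the trimmed copy of the configuration (the <rm> section is left out) are illustrative assumptions, not part of any cluster tooling. These checks would not by themselves explain the "killed because we rejoined the cluster without a full restart" message; they only rule out a few simple configuration mistakes.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the cluster.conf quoted in this thread (the <rm> section
# is omitted because the checks below do not look at it).
CLUSTER_CONF = """<?xml version="1.0"?>
<cluster config_version="4" name="HA_TEST">
  <fence_daemon clean_start="1" post_fail_delay="0" post_join_delay="60"/>
  <clusternodes>
    <clusternode name="node1" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="node1fence" option="reboot"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="node2" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="node2fence" option="reboot"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman cluster_id="0" expected_votes="1" two_node="1"/>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" ipaddr="12.1.1.80"
                 login="administrator" name="node1fence" passwd="administrator"/>
    <fencedevice agent="fence_ipmilan" ipaddr="12.1.1.81"
                 login="administrator" name="node2fence" passwd="administrator"/>
  </fencedevices>
</cluster>
"""


def check_cluster_conf(xml_text):
    """Hypothetical helper: return a list of problems found in a cluster.conf string."""
    root = ET.fromstring(xml_text)
    problems = []

    nodes = root.findall("clusternodes/clusternode")

    # Two-node mode is expected to be combined with expected_votes="1"
    # and exactly two cluster nodes.
    cman = root.find("cman")
    if cman is not None and cman.get("two_node") == "1":
        if cman.get("expected_votes") != "1":
            problems.append('two_node="1" should be paired with expected_votes="1"')
        if len(nodes) != 2:
            problems.append('two_node="1" is set but %d nodes are defined' % len(nodes))

    # Node IDs must be unique across the cluster.
    node_ids = [node.get("nodeid") for node in nodes]
    if len(set(node_ids)) != len(node_ids):
        problems.append("duplicate nodeid values: %s" % node_ids)

    # Every fence device referenced by a node must be defined in <fencedevices>.
    defined = {dev.get("name") for dev in root.findall("fencedevices/fencedevice")}
    for node in nodes:
        for dev in node.findall("fence/method/device"):
            if dev.get("name") not in defined:
                problems.append(
                    "node %s references undefined fence device %s"
                    % (node.get("name"), dev.get("name"))
                )

    return problems


if __name__ == "__main__":
    for problem in check_cluster_conf(CLUSTER_CONF):
        print(problem)
```

Run against the quoted configuration, the function returns an empty list, which fits the suspicion in this thread that the cause lies elsewhere (IGMP, openais, the switches) rather than in an obvious cluster.conf mistake.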