Alain.Moulle wrote:
> Hi,
>
> I'm again facing this problem of "Node evicted" and "Node is undead" ...
> And I really don't know what to do ... below are the traces in syslog.
> My version is: RHEL5.3 / cman-2.0.98-1.el5
>
> Feb 25 14:33:33 s_sys@xn3 qdiskd[27582]: <notice> Writing eviction notice for node 2
> Feb 25 14:33:34 s_sys@xn3 qdiskd[27582]: <notice> Node 2 evicted
> Feb 25 14:33:35 s_sys@xn3 qdiskd[27582]: <crit> Node 2 is undead.
> ... etc.
> Feb 25 14:33:45 s_sys@xn3 qdiskd[27582]: <crit> Node 2 is undead.
> Feb 25 14:33:45 s_sys@xn3 qdiskd[27582]: <alert> Writing eviction notice for node 2
> Feb 25 14:33:46 s_sys@xn3 qdiskd[27582]: <crit> Node 2 is undead.
> Feb 25 14:33:46 s_sys@xn3 qdiskd[27582]: <alert> Writing eviction notice for node 2
> Feb 25 14:33:47 s_kernel@xn3 kernel: dlm: closing connection to node 2
> Feb 25 14:33:47 s_sys@xn3 fenced[27785]: xn4 not a cluster member after 0 sec post_fail_delay
> Feb 25 14:33:47 s_sys@xn3 fenced[27785]: fencing node "xn4"
> Feb 25 14:33:47 s_sys@xn3 qdiskd[27582]: <crit> Node 2 is undead.
> ... etc.
> Feb 25 14:33:52 s_sys@xn3 qdiskd[27582]: <alert> Writing eviction notice for node 2
> Feb 25 14:33:52 s_sys@xn3 fenced[27785]: fence "xn4" success
> Feb 25 14:33:53 s_sys@xn3 qdiskd[27582]: <crit> Node 2 is undead.
> Feb 25 14:33:53 s_sys@xn3 qdiskd[27582]: <alert> Writing eviction notice for node 2
> Feb 25 14:33:54 s_sys@xn3 qdiskd[27582]: <crit> Node 2 is undead.
> Feb 25 14:33:54 s_sys@xn3 qdiskd[27582]: <alert> Writing eviction notice for node 2
> Feb 25 14:33:54 s_sys@xn3 clurgmgrd[27990]: <notice> Taking over service service:lustre_xn4 from down member xn4
> Feb 25 14:33:55 s_sys@xn3 qdiskd[27582]: <crit> Node 2 is undead.
> ... etc.
>
> And then, after rebooting xn4, when we try to start the CS on xn4, it
> can't join the cluster, and we must stop CS on both nodes and start it
> again on both sides.
>
> Where could this problem come from? How can I avoid this eviction of
> the node?
>
> Any help would be very much appreciated.

You haven't posted any cman/openais messages, but it's quite possible
you've hit this bug:

https://bugzilla.redhat.com/show_bug.cgi?id=485026

There's a patch included and some links to fixed RPMs.

Chrissie

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster