On Wed, 2008-06-04 at 14:47 +0200, Alain Moulle wrote:
> Hi
>
> About my problem of a node entering a loop:
> Jun 3 15:54:49 s_sys@xn2 qdiskd[22256]: <notice> Writing eviction notice for node 1
> Jun 3 15:54:50 s_sys@xn2 qdiskd[22256]: <notice> Node 1 evicted
> Jun 3 15:54:51 s_sys@xn2 qdiskd[22256]: <crit> Node 1 is undead.
>
> I notice that just before entering this loop, I have the messages:
> Jun 3 15:54:47 s_sys@xn2 fenced[22327]: fencing node "xn1"
> Jun 3 15:54:48 s_sys@xn2 qdiskd[22256]: <info> Assuming master role
>
> but never the message:
> Jun 3 15:54:47 s_sys@xn2 fenced[22327]: fence "xn1" success
>
> Nevertheless, the service on xn1 is failed over to xn2 correctly, but
> after xn1 reboots we cannot start CS5 again because of the endless
> "Node is undead" loop on xn2.
>
> Whereas when everything works correctly, both messages:
> fencing node "xn1"
> fence "xn1" success
> appear in succession (about 30 s apart).
>
> So my question is: could this endless "Node is undead" loop be
> systematically caused by a failed fencing of xn1 by xn2?
>
> PS: note that I have applied this patch:
> http://sources.redhat.com/git/?p=cluster.git;a=commit;h=b2686ffe984c517110b949d604c54a71800b67c9

Yes. If qdiskd thinks the node is dead and the node starts writing to
the disk again (which is what fencing should prevent), it will display
those messages.

-- Lon

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
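[Follow-up note] For anyone hitting the same loop, here is a minimal
conceptual sketch of the behavior Lon describes. This is not the actual
qdiskd source; the function and state names are invented for
illustration. The idea: once the master has evicted a node, it keeps
watching that node's block on the quorum disk. If the block keeps being
updated afterwards (which a successful fence would have prevented), the
node is flagged "undead" on every scan cycle, producing the repeating
log message.

```python
# Conceptual sketch only -- not real qdiskd code. Names are hypothetical.

def check_node(state, disk_timestamp):
    """Classify a node from its quorum-disk block.

    state: dict with an 'evicted' flag and the last 'timestamp' we saw
    disk_timestamp: timestamp just read from the node's disk block
    """
    if not state["evicted"]:
        state["timestamp"] = disk_timestamp
        return "alive"
    if disk_timestamp != state["timestamp"]:
        # An evicted node updated its block: fencing must have failed.
        state["timestamp"] = disk_timestamp
        return "undead"
    return "dead"

state = {"evicted": True, "timestamp": 100}
# xn1 was never actually fenced, so it keeps writing new timestamps:
for t in (101, 102, 103):
    print(check_node(state, t))  # "undead" on every cycle -> the log loop
```

Once the node really stops writing (i.e. the fence succeeds), the
timestamp stops changing and the node is simply reported dead, which is
why fixing the fence device/agent makes the loop go away.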