CS5 / about loop "Node is undead"

Alain Moulle <Alain.Moulle@xxxxxxxx> · Wed, 04 Jun 2008 14:47:21 +0200

Hi

About my problem of a node entering a loop:
Jun  3 15:54:49 s_sys@xn2 qdiskd[22256]: <notice> Writing eviction notice for node 1
Jun  3 15:54:50 s_sys@xn2 qdiskd[22256]: <notice> Node 1 evicted
Jun  3 15:54:51 s_sys@xn2 qdiskd[22256]: <crit> Node 1 is undead.

I notice that just before entering this loop, I get these messages:
Jun  3 15:54:47 s_sys@xn2 fenced[22327]: fencing node "xn1"
Jun  3 15:54:48 s_sys@xn2 qdiskd[22256]: <info> Assuming master role

but never the message:
Jun  3 15:54:47 s_sys@xn2 fenced[22327]: fence "xn1" success

Nevertheless, the service of xn1 is correctly failed over to xn2, but
after the reboot of xn1 we cannot start CS5 again because of the
endless "Node is undead" loop on xn2.

When it works correctly, on the other hand, both messages:
fencing node "xn1"
fence "xn1" success
appear one after the other (after about 30s).

So my question is: could this endless "Node is undead" loop always be
caused by a failed fencing of xn1 by xn2?
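
One way to check this correlation over a longer log could be a small
script like the one below. It is only a sketch: it assumes the fenced
message formats are exactly the ones quoted in this mail, and the
function name unconfirmed_fences is hypothetical, not part of any
cluster tool.

```python
import re

# Assumed message formats, taken from the fenced lines quoted in this mail.
FENCING = re.compile(r'fenced\[\d+\]: fencing node "([^"]+)"')
SUCCESS = re.compile(r'fenced\[\d+\]: fence "([^"]+)" success')


def unconfirmed_fences(lines):
    """Return the nodes whose fencing started but never reported success.

    `lines` is any iterable of syslog lines (a list or an open file).
    """
    pending = []
    for line in lines:
        started = FENCING.search(line)
        if started:
            node = started.group(1)
            if node not in pending:
                pending.append(node)
            continue
        done = SUCCESS.search(line)
        if done and done.group(1) in pending:
            # Fencing was confirmed, so this node is no longer suspect.
            pending.remove(done.group(1))
    return pending
```

It could be called on an open log file, for example
unconfirmed_fences(open(path)), where path is wherever syslog writes
the fenced messages on the node. If xn1 is in the returned list each
time the "Node is undead" loop appears, that would support the
failed-fencing hypothesis.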

PS: note that I have applied this patch:
http://sources.redhat.com/git/?p=cluster.git;a=commit;h=b2686ffe984c517110b949d604c54a71800b67c9

Thanks
Regards
Alain Moullé
