Hi Lon,

Lon Hohberger wrote:
Do they crash (panic), or do they just become totally unresponsive?
One server suddenly becomes unresponsive, as if frozen. The second server then starts to miss heartbeats from the first. At the moment I have configured manual fencing, so the service is not relocated (more on this below). If I remember correctly, restarting the locked machine is not enough; I have to reboot the working one too.
Have you tried getting a stack trace from the console using sysrq? (echo 1 > /proc/sys/kernel/sysrq; then hit alt-sysrq-t from the console).
No, I haven't. I will try that too.
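A minimal sketch of the same request issued from a shell rather than the keyboard, assuming root access and a kernel that exposes /proc/sysrq-trigger. The function name and the two overridable variables are hypothetical, added only so the steps can be dry-run against scratch files. On a machine that is already completely frozen no shell may be reachable, in which case the console key combination remains the only option.

```shell
#!/bin/sh
# Sketch only: request a task-state dump via the magic SysRq interface.
# SYSRQ_ENABLE and SYSRQ_TRIGGER are hypothetical override variables;
# by default they point at the real /proc files.
SYSRQ_ENABLE="${SYSRQ_ENABLE:-/proc/sys/kernel/sysrq}"
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"

sysrq_task_dump() {
    # Enable the SysRq key functions (the same command quoted above),
    # so that alt-sysrq-t also works later from the console.
    echo 1 > "$SYSRQ_ENABLE" || return 1
    # Writing 't' asks the kernel to dump the state of every task
    # to the kernel log, like pressing alt-sysrq-t.
    echo t > "$SYSRQ_TRIGGER" || return 1
}
```

Run as root, this would be called as `sysrq_task_dump`, after which the trace should appear on the console and in the kernel log (for example via `dmesg`).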
One thing that's peculiar is that - if they are locking up, they have to be locking up at about the same time -- otherwise, one would fence the other, and life would go on.
As I wrote, only one gets locked. The fencing configuration is a separate problem for me, and one I am aware of. I haven't understood very well how it works; it looks like I need an external device that manages power. In that case, which device, and consequently which fencing method, is most suitable? I am rather confused about this topic.
Fabrizio

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster