Re: Fenced node never reboots properly

Jeroen van den Horn <J.vandenHorn@xxxxx> · Fri, 30 Mar 2007 16:07:16 +0200

In response to Lon's suggestion I modified the fence_vmware code and
set the reset type to HARD; the cluster node now resets properly.
The remaining issue is that under VMware we are still experiencing
performance problems. A node in the cluster seems to start 'lagging
behind' (its system clock also starts drifting), and after some
time one of the nodes declares the other dead.

Does anybody have any pointers on performance issues and/or clock
drift with GFS on virtual machines?
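
To put a number on the drift when comparing notes, a minimal Python
sketch follows. It assumes a trusted reference time (for example from
the VMware host or an NTP server) can be sampled alongside the guest
clock; the function name drift_ppm and the sample values are
hypothetical and not part of GFS or fence_vmware.

```python
def drift_ppm(local_start, ref_start, local_end, ref_end):
    """Return guest clock drift relative to a reference clock, in parts per million.

    Positive means the guest clock runs fast; negative means it lags behind.
    All arguments are timestamps in seconds.
    """
    ref_elapsed = ref_end - ref_start
    if ref_elapsed <= 0:
        raise ValueError("reference interval must be positive")
    local_elapsed = local_end - local_start
    return (local_elapsed - ref_elapsed) / ref_elapsed * 1e6


# Hypothetical samples: over 3600 s of reference time the guest advanced only 3590 s.
print(round(drift_ppm(0.0, 0.0, 3590.0, 3600.0), 1))
```

A negative result would match the 'lagging behind' symptom; how the two
pairs of timestamps are collected is left open here.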

Regards,

Jeroen

I'm using fence_vmware, which I downloaded from some CVS repository.
Good to hear that that is the issue - I'll take a look at the source
and see whether the VMware API supports some sort of 'hard reset'.

Jeroen

Lon Hohberger wrote:

    On Thu, Mar 29, 2007 at 10:04:00AM +0200, Jeroen van den Horn wrote:

However, during shutdown node 2 executes /etc/rc6.d/S31umountnfs (it's a
Debian system), which also attempts to unmount the GFS disk - result:
kernel OOPS. The system continues shutdown until it says 'Will now
restart.' but that's the end of it. I've tried setting
/proc/sys/kernel/panic and adding 'panic=5' to the kernel boot options,
but to no avail.

I'm really at a loss here - does anybody have any suggestions on how to 
solve this problem?

Yes, it's supposed to be killed (immediately) when fenced, not
gracefully attempting to shut down.  What fencing agent are you using?
It sounds like there's a bug.

-- Lon

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
