RE: [Linux-cluster] GFS 6.0 node without quorum tries to fence


 



Before I tried manual fencing, I tried this with automatic fencing
(fence_rib). mitte was always faster and fenced oben and unten. This
means that one faulty node can reboot all the other nodes, which I do
not think is acceptable. And even after the reboot the problem is not
solved, because the faulty node is still faulty.

A node should only be allowed to fence if it is Master and has
quorum, and never while it is in arbitrating mode.
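
To make the proposed rule concrete, here is a minimal sketch of the
check I have in mind. This is not GFS or lock_gulm code: the names
(Role, NodeView, has_quorum, may_fence) are made up for illustration,
and I am assuming quorum means a strict majority of the configured
votes, with one vote per node. That oben ends up as Master on the
surviving side is also just an assumption for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    # Hypothetical role names, chosen only for this sketch.
    MASTER = "master"
    SLAVE = "slave"
    ARBITRATING = "arbitrating"


@dataclass
class NodeView:
    """What a single node believes about the cluster right now."""
    name: str
    role: Role
    reachable_votes: int  # votes this node can still see, including its own
    total_votes: int      # votes configured for the whole cluster


def has_quorum(view: NodeView) -> bool:
    # Assumption: quorum is a strict majority of all configured votes.
    return 2 * view.reachable_votes > view.total_votes


def may_fence(view: NodeView) -> bool:
    # Proposed rule: only a Master that holds quorum may fence.
    # An arbitrating node is not Master, so it is excluded as well.
    return view.role is Role.MASTER and has_quorum(view)


# The three-node netsplit from this thread: mitte is cut off,
# oben and unten can still see each other.
mitte = NodeView("mitte", Role.MASTER, reachable_votes=1, total_votes=3)
oben = NodeView("oben", Role.MASTER, reachable_votes=2, total_votes=3)
unten = NodeView("unten", Role.SLAVE, reachable_votes=2, total_votes=3)

for node in (mitte, oben, unten):
    print(node.name, "may fence:", may_fence(node))
```

With such a rule, mitte would not be allowed to fence anybody, even if
it still considers itself Master, because it only sees one of three
votes.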

> -----Original Message-----
> From: linux-cluster-bounces@xxxxxxxxxx 
> [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of Steve Landherr
> Sent: Tuesday, August 3, 2004 18:23
> To: Discussion of clustering software components including GFS
> Subject: RE: [Linux-cluster] GFS 6.0 node without quorum 
> tries to fence
> 
> 
> In a netsplit, what does fencing achieve when done by a node 
> that doesn't have quorum?  It still won't have quorum.  It 
> should probably just clean up as best it can and leave the 
> rest of the cluster alone.
> 
> -steve
> --
> Steve Landherr -- landherr@xxxxxxxxxx
> Kazeon Systems, Inc.
> Mountain View, California
> 
> -----Original Message-----
> From: linux-cluster-bounces@xxxxxxxxxx 
> [mailto:linux-cluster-bounces@xxxxxxxxxx] On Behalf Of 
> Michael Conrad Tadpol Tilstra
> Sent: Tuesday, August 03, 2004 9:13 AM
> To: Discussion of clustering software components including GFS
> Subject: Re: [Linux-cluster] GFS 6.0 node without quorum 
> tries to fence
> 
> So looking at what you gave below, mitte was master (I am 
> guessing this from the "Core lost slave quorum" part of the 
> message below).  It knows that it doesn't have quorum, but it 
> is still going to try to be the Master. It does not know 
> "that it can not build a cluster."  The only thing it knows 
> right now about the other nodes is that they failed to send 
> heartbeats.  Therefore they must have left the cluster 
> abnormally. Therefore it must fence them.
> 
> The other two nodes see that mitte has failed to reply to 
> heartbeats. Therefore it must have left the cluster 
> abnormally.  Therefore it must be fenced.
> 
> Both sides of the netsplit are trying to resolve things to 
> regain the cluster.  From an outsider's viewpoint (which you 
> and I have, but the nodes do not), we can see that mitte's 
> attempts are futile: oben and unten will get control of the 
> cluster.  But the node cannot see this.
> 
> This is what makes netsplits kind of ugly.  
> 
> (Using ifdown to test cluster stuff causes extra confusion, in 
> my opinion, because you are actually creating a netsplit 
> case, not a simpler node-down case.  The power switch is 
> nice for this.)
> 
> 
> I hope that made some sense.
> 
> -- 
> Michael Conrad Tadpol Tilstra
> Blood is thicker than water, and much tastier.
> 
> 
> 
> --
> 
> Linux-cluster@xxxxxxxxxx 
> http://www.redhat.com/mailman/listinfo/linux-cluster
> 

