Lon, thank you for the response. It appears that what I thought was a
fence duel was actually the cluster fencing the proper node and DRBD
halting the surviving node after a split-brain scenario. (I obviously
have some work to do on my drbd.conf.) After the fenced node revived,
it saw that the other node was unresponsive (it had been halted) and then
fenced it, in this case inducing it to power back on.
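For reference, the part of drbd.conf I need to revisit is roughly the
following (a sketch assuming DRBD 8.x; the resource name is a placeholder
and the option names are worth checking against the drbd.conf man page for
your version):

    resource r0 {
        net {
            # automatic split-brain recovery policies
            after-sb-0pri discard-zero-changes;
            after-sb-1pri discard-secondary;
            after-sb-2pri disconnect;
        }
        handlers {
            # the stock example halts the node that lost after a split brain,
            # which would explain the "surviving" node going down
            pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
        }
    }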
Our DRAC shares the NICs with the host. We will probably hack on the
DRAC fence script a little to take advantage of some of the other
available features besides a plain power-off/power-on cycle.
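Right now the fencing section of cluster.conf is just the stock fence_drac
setup, roughly like this (node names, address, and credentials are
placeholders):

    <clusternode name="node1" nodeid="1" votes="1">
        <fence>
            <method name="1">
                <device name="drac-node1"/>
            </method>
        </fence>
    </clusternode>

    <fencedevices>
        <fencedevice agent="fence_drac" name="drac-node1" ipaddr="10.0.0.11"
                     login="root" passwd="secret"/>
    </fencedevices>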
Using two_node=1 may be an option again, but the FAQ indicates a
quorum disk might still be beneficial. Using a loop device didn't seem
to go so well, though that could be due to configuration error. Having one
node unable to see the qdisk is probably an automatic test failure.
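If we do keep qdiskd in the mix, what I had in mind for the ping heuristic
is roughly the following in cluster.conf (a sketch; the device label, ping
target, and scores are placeholders, and both nodes would need to see the
same device for it to be meaningful):

    <quorumd interval="1" tko="10" votes="1" label="clusterqdisk">
        <heuristic program="ping -c1 -w1 192.168.1.254" score="1" interval="2" tko="3"/>
    </quorumd>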
Thanks again,
Chris
Lon Hohberger wrote:
On Wed, Jun 20, 2007 at 05:57:05PM -0500, Chris Harms wrote:
My nodes were set to "quorum=1 two_node=1" and fenced by DRAC cards
using telnet over their NICs (the same NICs used in my bonded config on
the OS), so I assumed it was on the same network path. Perhaps I assumed
incorrectly.
That sounds mostly right. The point is that a node disconnected from
the cluster must not be able to fence a node which is supposedly still
connected.
That is: 'A' must not be able to fence 'B' if 'A' becomes disconnected
from the cluster. However, 'A' must be able to be fenced if 'A' becomes
disconnected.
Why was DRAC unreachable? Was it unplugged too? (Is DRAC like IPMI, in
that it shares a NIC with the host machine?)
The desired effect would be for the survivor to claim the service(s)
running on the unreachable node and attempt to fence it, or to bring it
back online without fencing should contact be re-established. The actual
result was that the survivor spun its wheels trying to fence the
unreachable node and did not assume the services.
Yes, this is an unfortunate limitation of using (most) integrated power
management systems. Basically, some BMCs share a NIC with the host
(IPMI), and some run off of the machine's power supply (IPMI, iLO,
DRAC). When the fence device becomes unreachable, we don't know whether
it's a total network outage or a "power disconnected" state.
* If the power to a node has been disconnected, it's safe to recover.
* If the node just lost all of its network connectivity, it's *NOT* safe
to recover.
* In both cases, we cannot confirm the node is dead... which is why we
don't recover.
Restoring network connectivity induced the previously
unreachable node to reboot, and the surviving node experienced some kind
of weird power-off and then powered back on (???).
That doesn't sound right; the surviving node should have stayed put (not
rebooted).
Ergo I figured I must need a quorum disk so I can use something like a
ping node. My present plan is to use a loop device for the quorum disk
device and then set up ping heuristics. Will this even work, i.e. do both
nodes need to see the same qdisk, or can I fool the service with a loop
device?
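Concretely, what I was picturing is something like this on each node
(a sketch; file path, loop device, and label are placeholders), with each
node seeing only its own private copy rather than a shared device:

    # create a small backing file and attach it as a loop device
    dd if=/dev/zero of=/var/lib/qdisk.img bs=1M count=16
    losetup /dev/loop0 /var/lib/qdisk.img
    # initialize it as a quorum disk, with the label referenced from cluster.conf
    mkqdisk -c /dev/loop0 -l clusterqdisk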
I don't believe the effect of tricking qdiskd in this way has been
explored; I don't see why it wouldn't work in theory, but... qdiskd, with
or without a disk, won't fix the behavior you experienced (uncertain
state due to failure to fence -> retry / wait for the node to come back).
I am not deploying GFS or GNBD, and I have no SAN. My only
option would be to add another DRBD partition for this purpose, which may
or may not work.
What is the proper setup option, two_node=1 or qdisk?
In your case, I'd say two_node="1".
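For reference, that corresponds to the following in cluster.conf; as far
as I know, two_node="1" needs to be paired with expected_votes="1":

    <cman two_node="1" expected_votes="1"/>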