On Thu, Jul 12, 2007 at 03:03:46AM -0500, James Fait wrote:
> I am in the process of implementing clustering for shared data storage
> across a number of nodes, with several nodes exporting large GNBD
> volumes, and also new storage from an iSCSI RAID chassis with 6TB of
> storage. The nature of the application requires that the nodes that
> access the data store are pretty much independent of each other, just
> providing CPU and graphics support while reading several hundred
> megabytes of image data in 32 MB chunks, and writing numerous small
> summary files of this data. Our current methodology, which works but is
> slow, is to serve the data by NFS over gigabit ethernet. A similar
> facility nearby, with the same application, has implemented GFS on FC
> equipment, and is using the FC switch for fencing. As I have somewhat
> different storage hardware and data retention requirements, I need to
> implement different fencing methods.
>
> The storage network is on a 3com switch, which is able to take down a
> given link via a telnet command, and later restore it.

We don't have an agent for this; you'll have to assemble one.

> Also, each of the storage nodes has a Smart UPS with control over the
> individual outlets on the UPS, which could be used for power fencing of
> the GNBD server nodes. The only issue there is that these are not
> networked UPS systems, but are connected via serial ports to some of
> the nodes. On the network switch fencing, I am currently using the
> storage net for cluster communications, so bringing down a port also
> stops cluster communications.

Loss of cluster comms is generally what will require I/O fencing in the
first place; that is, a node being fenced and losing cluster comms is
quite fine. Note, however, that without power fencing, you can't just
turn ports back on - you need to go in and reboot the nodes that were
fenced and manually re-enable them.

> I know I will probably have to write a fence agent for at least some of
> the parts of this. The questions that I have are the exact sequence of
> events for fencing a node, as in who initiates the fencing operation,

One node handles fencing, if I recall correctly. I don't remember how
that node is chosen; someone else will have to answer that.

> and what is the sequence of events for recovery and rejoining the
> cluster after a reboot.

(a) node is fenced
(b) administrator turns node off
(c) administrator unfences node
(d) administrator turns node on

... should work :)

> I currently have a test setup of four nodes with a 4TB GNBD export from
> one of the nodes to the other three, using fence_gnbd on those nodes,
> and fence_gnbd_notify with fence_manual on the server, at least until I
> can get the UPS fence agent working. If I need to, I can put the UPS
> systems on a network terminal server to allow any node to connect to
> the UPS for commands, but would prefer that it connect to one of the
> cluster nodes directly using the serial port.

The terminal server idea will work better - and be less complex in the
end. It won't matter which node fences which other node - and, by the
same logic, it won't matter which nodes are online at the time fencing
must be done for a particular node.
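As for assembling an agent for the switch: the skeleton is small. fenced
hands the agent its parameters as "name = value" lines on stdin, so
something along these lines would be a starting point. This is only a
sketch - the key names I've used (ipaddr, login, passwd, port, option),
the switch prompts, and the "port disable/enable" command are placeholders
you'd need to check against the fence agent man pages and the 3com CLI
guide:

#!/usr/bin/env python
# Sketch of a fence agent for a telnet-manageable switch port.
# fenced feeds the agent "name = value" pairs on stdin; everything
# switch-specific below (prompts, command syntax) is a placeholder.

import sys
import telnetlib

def read_stdin_args():
    """Collect the name = value pairs fenced writes to stdin."""
    args = {}
    for line in sys.stdin:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        args[name.strip()] = value.strip()
    return args

def main():
    args = read_stdin_args()
    ipaddr = args["ipaddr"]                  # switch management address
    login = args.get("login", "")
    passwd = args.get("passwd", "")
    port = args["port"]                      # switch port of the victim node
    action = args.get("option", "disable")   # disable = fence, enable = unfence

    tn = telnetlib.Telnet(ipaddr)
    if login:
        tn.read_until("Login:")              # prompt strings are guesses
        tn.write(login + "\n")
    tn.read_until("Password:")
    tn.write(passwd + "\n")
    tn.read_until(">")

    # Hypothetical CLI syntax - substitute the real 3com commands here.
    tn.write("port %s %s\n" % (action, port))
    tn.read_until(">")
    tn.write("logout\n")
    tn.close()
    return 0                                 # 0 tells fenced the fence succeeded

if __name__ == "__main__":
    sys.exit(main())

Make sure it exits non-zero on any failure so fenced doesn't treat it as
a successful fence. The same pattern would also fit the telnet management
interface on the iSCSI chassis you mention below.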
> For the iSCSI chassis, from the manual it appears that I can force an
> iSCSI disconnect via SNMP or telnet using the management interface for
> the chassis, which, from my reading of the RFC, should be an effective
> fence for iSCSI, as it will invalidate the current connection from the
> initiator and require re-authentication and renegotiation of the link
> before allowing further communication with that node.

Yes, and the node will need to be rebooted.

> Hopefully, this gives enough information to at least get a start on
> this, as it involves several issues, each of which may need separate
> follow-up.

If all of the nodes have power fencing, you can avoid a lot of hassle.
Try to take the simplest path; complexity is inversely proportional to
reliability. How many nodes have UPS access?

--
Lon Hohberger - Software Engineer - Red Hat, Inc.