On 06/18/2010 07:57 AM, Jankowski, Chris wrote:
> Using the analogy you gave, the problem with a mafioso is that he cannot
> kill all the other mafiosos in the gang when they are all sitting in
> solitary confinement cells (:-)).
Do you have a better idea? How do you propose to ensure that there is no
resource clash when a node becomes intermittent or half-dead? How do you
prevent its interference from bringing down the service? What do you
propose? More importantly, how would you propose to handle this when
ensuring consistency is of paramount importance, e.g. when using a
cluster file system?
> I would like to remark that this STONITH business causes endless
> problems in clusters within a single data centre too. For example, a
> temporary hiccup on the network that causes a short heartbeat failure
> triggers all nodes of the cluster to kill the other nodes. And boy,
> do they succeed with a typical HP iLO fencing. You can see all your
> nodes going down. Then they come back and the shootout continues
> essentially indefinitely if fencing works. If not, then they all
> block.
If your network is that intermittent, you have bigger problems.
But you can adjust your cman timeout values (<totem token="[timeout in
milliseconds]"/>) to something more appropriate to the quality of your
network.
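For example, in /etc/cluster/cluster.conf (the value is illustrative;
token is the timeout in milliseconds, so this is 30 seconds):

    <?xml version="1.0"?>
    <cluster name="mycluster" config_version="2">
      <totem token="30000"/>
      <!-- clusternodes, fencedevices etc. as before -->
    </cluster>

Remember to bump config_version and propagate the change to the other
nodes (e.g. with ccs_tool update).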
> And all of that is so unnecessary, as a combination of a properly
> implemented quorum disk and SCSI reservations with local boot disks
> and data disks on shared storage could provide quorum maintenance,
> split-brain avoidance and protection of the integrity of the
> filesystem.
I disagree. If a node starts to go wrong, it cannot be trusted not to
trash the file system, quorums and suchlike notwithstanding. Data
integrity is too important to take that risk.
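(For reference, the qdisk approach being described is configured via
qdiskd in /etc/cluster/cluster.conf, roughly along these lines; the
values and the heuristic are purely illustrative:

    <quorumd interval="1" tko="10" votes="1" label="qdisk">
      <heuristic program="ping -c1 -w1 10.0.0.254" score="1" interval="2" tko="3"/>
    </quorumd>

Even with this in place, the misbehaving node still has to be trusted
to honour the quorum result, which is exactly my objection.)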
> DEC ASE cluster on Ultrix and MIPS hardware had that in 1991. You do
> not even need GFS2, although it is very nice to have a real cluster
> filesystem.
If you want something that's looser than a proper cluster FS without the
need for fencing (and are happy to live with the fact that when
split-brain occurs, one copy of each conflicting file will win and the
other copies _will_ get trashed), you may want to look into GlusterFS if
you haven't already.
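A minimal client volfile for two-way replication looks roughly like
this (hostnames and volume names are illustrative):

    volume remote1
      type protocol/client
      option transport-type tcp
      option remote-host server1
      option remote-subvolume brick
    end-volume

    volume remote2
      type protocol/client
      option transport-type tcp
      option remote-host server2
      option remote-subvolume brick
    end-volume

    volume mirror
      type cluster/replicate
      subvolumes remote1 remote2
    end-volume

The cluster/replicate translator is what does the mirroring, and its
resolution after a split-brain is where the losing copies get trashed.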
> By the way, I believe that commercial stretched cluster on Linux is
> not possible if you rely on LVM for distributed storage. Linux LVM
> is architecturally incapable of providing any resilience over
> distance, IMHO. It is missing the plex and subdisk layers as in
> Veritas LVM and has no notion of location, so it cannot tell
> which piece of storage is in which data centre. The only volume
> manager I know of that has this feature is in OpenVMS. Perhaps
> the latest Veritas has it too.
I never actually found a purpose for LVM that cannot be done away with
if you apply a modicum of forward planning (something that seems to be
becoming quite rare in most industries these days). There are generally
better ways than LVM to achieve the things that LVM is supposed to do.
> One could use distributed storage arrays of the HP P4000 type
> (which HP acquired with LeftHand Networks). This shifts the problem
> from the OS to the storage vendor.
> What distributed storage would you use in a hypothetical stretched
> cluster?
Depends on what exactly your use-case is. In most use-cases, properly
distributed storage (a la CleverSafe) comes with too much of a
performance penalty to be useful when geographically dispersed. The
single most defining measure of a system's performance is access
latency. When caching gets difficult and your ping times move from LAN
(slow) to WAN (ridiculous), performance generally becomes completely
unworkable.
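To put rough numbers on it: a synchronous write that needs one network
round trip per operation is limited to about 1/RTT serialized
operations per second. At a 0.2ms LAN RTT that is ~5,000 ops/sec; at a
50ms WAN RTT it drops to ~20 ops/sec, a 250x difference before the
disks are even involved. (The RTT figures are illustrative.)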
Gordan
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster