Re: GNBD -> MD -> GFS?

Jayson Vantuyl <jvantuyl@xxxxxxxxxxxxxx> · Tue, 22 May 2007 03:27:46 -0700

MD is a problem.  Since it's not cluster aware, you can have a situation where writes that are supposed to be atomic aren't actually atomic.
Specifically, there are times when a write to disk needs to either happen completely or not happen at all.  Often this write spans more than a few blocks.  With clustered locking, you get a guarantee that two writers won't attempt to hit the same data at the same time.  With Linux MD underneath, you lose that guarantee, because each node's MD instance knows nothing about the other nodes.

Assume that you have blocks A and B that must be written together atomically.  If two nodes try to write them simultaneously, it is possible, due to timing, for block A to end up as a copy from one node and block B as a copy from the other.  Things get even hairier when MD is rebuilding a RAID, as both machines will attempt to remirror the array at once!  Madness ensues.
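
To make the A/B case concrete, here is a rough sketch in Python.  It is only a toy model, not how MD or GNBD actually issue I/O: the names (node1, node2, apply_schedule, interleavings) are made up for illustration, and it assumes each node writes A and then B, with the last write to each block winning.  It replays every possible ordering of the four writes and counts how many leave A and B from different nodes.

```python
def apply_schedule(schedule):
    """Replay a sequence of (node, block) writes against a two-block disk."""
    disk = {"A": None, "B": None}
    for node, block in schedule:
        disk[block] = node  # last writer wins, independently per block
    return disk


def interleavings(ops1, ops2):
    """Yield every merge of ops1 and ops2 that keeps each node's own order."""
    if not ops1:
        yield list(ops2)
        return
    if not ops2:
        yield list(ops1)
        return
    for rest in interleavings(ops1[1:], ops2):
        yield [ops1[0]] + rest
    for rest in interleavings(ops1, ops2[1:]):
        yield [ops2[0]] + rest


def is_torn(disk):
    """A torn result has block A from one node and block B from the other."""
    return disk["A"] != disk["B"]


# Hypothetical nodes, each writing A and then B.
node1 = [("node1", "A"), ("node1", "B")]
node2 = [("node2", "A"), ("node2", "B")]

# No cluster locking: any interleaving of the four writes can happen.
unlocked = [apply_schedule(s) for s in interleavings(node1, node2)]

# Cluster locking: one node finishes both blocks before the other starts.
locked = [apply_schedule(node1 + node2), apply_schedule(node2 + node1)]

torn_unlocked = sum(is_torn(d) for d in unlocked)
torn_locked = sum(is_torn(d) for d in locked)

print(f"unlocked: {torn_unlocked} of {len(unlocked)} orderings are torn")
print(f"locked:   {torn_locked} of {len(locked)} orderings are torn")
```

In this model the serialized (locked) orderings never produce a torn pair, while some of the unserialized ones do.  The real situation is worse, since each MD instance is also writing to more than one mirror leg.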

This is the reason that Red Hat has put together the cmirror target.  CMirror provides a simple RAID1-style DM target across a cluster.  There is also CSnap for doing snapshotting.  There isn't anything like a CRAID5.  Unfortunately, this is all a bit bleeding edge and not so well integrated into the stack you want to use (CLVM would need to be in there, for one thing).

On May 22, 2007, at 12:02 AM, Nathaniel Eliot wrote:

I've been looking for a Linux clustered file system that can mirror/stripe
across storage servers.  The end goal is to build a HA cluster from just FOSS
and commodity parts.  I've found two existing possibilities, each with drawbacks:

1) GPFS can do redundant storage, but isn't FOSS.
2) DRBD + GNBD + GFS is free, but can only do two active nodes.

I'm considering putting software RAID between GNBD and GFS on each storage node.
The superblocks would be retained, so each MD would theoretically have the same
view of data.  GFS is cluster aware, and GNBD built for clusters; the big
question is, can they work sandwiching the non-cluster-aware MD?

Any quick links/explanations as to why this won't work, and/or useful
suggestions, would be appreciated.  Thanks,

-- 
Nathaniel Eliot
T9 Productions

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster

 -- 
Jayson Vantuyl
Systems Architect
Engine Yard
jvantuyl@xxxxxxxxxxxxxx
