On Tue, Dec 22 2015, Adam Goryachev wrote:

> On 22/12/15 09:03, NeilBrown wrote:
>> On Tue, Dec 22 2015, Tejas Rao wrote:
>>
>>> On 12/21/2015 15:47, NeilBrown wrote:
>>>> On Tue, Dec 22 2015, Tejas Rao wrote:
>>>>
>>>>> What if the application is doing the locking and making sure that only 1
>>>>> node writes to a md device at a time? Will this work? How are rebuilds
>>>>> handled? This would be helpful with distributed filesystems like
>>>>> GPFS/lustre etc.
>>>>>
>>>> You would also need to make sure that the filesystem only wrote from a
>>>> single node at a time (or access the block device directly). I doubt
>>>> GPFS/lustre make any promise like that, but I'm happy to be educated.
>>>>
>>>> rebuilds are handled by using a cluster-wide lock to block all writes to
>>>> a range of addresses while those stripes are repaired.
>>>>
>>>> NeilBrown
>
> My understanding of MD level cross host RAID was that it would not
> magically create cluster aware filesystems out of non-cluster aware
> filesystems. ie, you wouldn't be able to use the same multi-host RAID
> device on multiple hosts concurrently with ext3.

This is correct.  The expectation is that clustered md/raid1 would be
used with a cluster-aware filesystem such as ocfs2 or gpfs.  Certainly
not with ext3 or similar.

>
> IMHO, if it was able to behave similar to DRBD, then that would be
> perfect (ie, enforce only a single node can write at a time (unless you
> specifically set it for multi-node write)). The benefit should be that
> you can lose a node without losing your data. After you lose that node,
> you can then "do something" to use the remaining node to access the data
> (eg, mount it, export with iscsi/nfs, etc).

There is a lot of similarity between DRBD and clustered md/raid1.
I don't know the current state of DRBD, but it initially assumed each
storage device was local to a single node and so sent data over the
network (i.e. over IP) to "remote" devices.  clustered md/raid1 assumes
that all storage is equally accessible to all nodes (over a 'storage
area network', which may still be IP).

So yes: if you lose a node you should not lose functionality.

>
> Currently, this is what I use DRBD for, previously, I've used NBD + MD
> RAID1 to do the same thing. One question though is what advantage
> multi-host MD RAID might have over the existing in-kernel DRBD ? Are
> there plans which show why this is going to be better, have better
> performance, features, etc?

I'm not the driving force behind clustered md/raid1 so I am not
completely familiar with the motivation, but I believe DRBD doesn't, or
didn't, make the best possible use of the storage network when every
storage device is connected to every compute node.  It is expected that
clustered md/raid1 will.

I *think* DRBD is primarily for a pair of nodes (though there is some
multi-node support).  clustered md/raid1 is designed to work with
multiple nodes - however big your cluster is.

(DRBD 9.0 appears to support multi-node configurations.  I haven't
researched the details.)

NeilBrown
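
For anyone trying to picture the resync locking described above, here is
a minimal, hypothetical sketch in Python.  It is purely illustrative:
none of the names come from the actual md-cluster code, and a local
condition variable stands in for the real cluster-wide DLM lock.  The
idea is that the node performing a resync "suspends" a range of sectors
cluster-wide, and writers on every node wait if their write overlaps
that range.

    import threading

    class ResyncRangeLock:
        """Toy stand-in for a cluster-wide 'suspend this sector range' lock."""

        def __init__(self):
            self._cond = threading.Condition()
            self._suspended = None          # (lo, hi) range being repaired, or None

        def suspend_range(self, lo, hi):
            """Resync node: block writes to sectors [lo, hi) cluster-wide."""
            with self._cond:
                self._suspended = (lo, hi)

        def resume(self):
            """Resync node: the range is repaired, let writers proceed."""
            with self._cond:
                self._suspended = None
                self._cond.notify_all()

        def wait_for_write(self, sector, nsectors):
            """Writer on any node: wait until this write no longer overlaps
            the suspended range."""
            with self._cond:
                while self._suspended is not None:
                    lo, hi = self._suspended
                    if sector + nsectors <= lo or sector >= hi:
                        break           # no overlap, the write may proceed
                    self._cond.wait()

In the real implementation the suspend/resume messages would of course
travel over the DLM to every node rather than through a shared in-memory
object, but the write path logic is the same: check the suspended range,
and stall only writes that overlap it.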