On Mon, 3 Nov 2014 14:01:10 -0700 James Simmons <uja.ornl@xxxxxxxxx> wrote:

> Hello.
>
> This is a patch against the latest kernel source which is based on
> a patch used by Lustre. The below describes what we are trying to
> achieve. I like to get a feedback if this is the right approach.
>
> ----------------------------------------------------------------------
>
> The ext4 MMP block reads always need to get fresh data from the
> underlying disk. Otherwise, if a remote node is updating the MMP
> block and the reads are fetched from the MD RAID5 stripe cache,
> it is possible that the local node will incorrectly decide the
> remote node has died and allow the filesystem to be mounted on
> two nodes at the same time.

It is preferred for patches to be inline, rather than as attachments, as it
makes it easier to comment on them....

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 9c66e59..11b749c 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2678,6 +2678,9 @@ static int add_stripe_bio(struct stripe_head *sh, struct bio *bi, int dd_idx, in
 		}
 		if (sector >= sh->dev[dd_idx].sector + STRIPE_SECTORS)
 			set_bit(R5_OVERWRITE, &sh->dev[dd_idx].flags);
+	} else if (bi->bi_rw & REQ_NOCACHE) {
+		/* force to read from underlying disk if requested */
+		clear_bit(R5_UPTODATE, &sh->dev[dd_idx].flags);
 	}

 	pr_debug("added bi b#%llu to stripe s#%llu, disk %d.\n",

This doesn't provide a useful guarantee.
If the device that stores that block has failed, md/raid5 will read all
the other devices to recover the block. If that recently happened and you
just clear the UPTODATE bit on the block, md/raid5 will recover the data from
the cached copies of all the other blocks, without re-reading them from disk.

But considering this at a higher level: if two different nodes try to
assemble the same RAID5 array then you already potentially have a problem.
You really want some sensible cluster co-ordinator and let it make these
decisions. Hoping that a block device can be a reliable semaphore seems ...
misguided.
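To make the degraded-array case above concrete, here is a toy user-space model of it. This is only a sketch and not kernel code: every name in it (toy_dev, toy_stripe, toy_fetch, toy_read, toy_stale_read_demo) is made up for illustration, the "blocks" are single bytes, and the geometry of three data blocks plus XOR parity is an assumption. It shows only that clearing the up-to-date flag on a block whose device has failed causes a rebuild from the cached copies of the other blocks, with no disk I/O.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NDEVS 4	/* hypothetical geometry: 3 data blocks + 1 parity */

struct toy_dev {
	unsigned char disk;	/* byte currently on the medium */
	unsigned char cache;	/* copy held in the toy stripe cache */
	bool uptodate;		/* stands in for R5_UPTODATE */
	bool failed;		/* device can no longer be read */
};

struct toy_stripe {
	struct toy_dev dev[TOY_NDEVS];
	int disk_reads;		/* how often we really went to a disk */
};

/* Return a working device's block; go to disk only if the cache is stale. */
static unsigned char toy_fetch(struct toy_stripe *sh, int i)
{
	struct toy_dev *d = &sh->dev[i];

	if (!d->uptodate) {
		d->cache = d->disk;
		d->uptodate = true;
		sh->disk_reads++;
	}
	return d->cache;
}

/* Read block idx; a failed device is rebuilt by XOR of all the others. */
static unsigned char toy_read(struct toy_stripe *sh, int idx)
{
	unsigned char v = 0;
	int i;

	if (!sh->dev[idx].failed)
		return toy_fetch(sh, idx);
	if (sh->dev[idx].uptodate)
		return sh->dev[idx].cache;
	for (i = 0; i < TOY_NDEVS; i++)
		if (i != idx)
			v ^= toy_fetch(sh, i);	/* cached blocks are not re-read */
	sh->dev[idx].cache = v;
	sh->dev[idx].uptodate = true;
	return v;
}

/* Runs the scenario; returns the value the local node sees for block 0. */
unsigned char toy_stale_read_demo(int *disk_reads)
{
	unsigned char v;
	/* Device 0 has failed; the stripe was recently rebuilt, so all is cached. */
	struct toy_stripe sh = {
		.dev = {
			{ 0x00, 0x01, true, true },	/* "MMP" block, device dead */
			{ 0x02, 0x02, true, false },
			{ 0x04, 0x04, true, false },
			{ 0x07, 0x07, true, false },	/* parity = 1 ^ 2 ^ 4 */
		},
	};

	assert(disk_reads);

	/*
	 * The remote node rewrites block 0 as 0x09. Device 0 is dead, so
	 * the only thing that changes on disk is the parity block.
	 */
	sh.dev[3].disk = 0x09 ^ 0x02 ^ 0x04;

	/* Model of what the patch does for a REQ_NOCACHE read. */
	sh.dev[0].uptodate = false;

	v = toy_read(&sh, 0);
	*disk_reads = sh.disk_reads;
	return v;
}
```

In this model toy_stale_read_demo() returns the old value 0x01 and leaves the read counter at 0, even though the on-disk parity now encodes 0x09.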
NeilBrown
--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel