Re: Linux Plumbers MD BOF discussion notes

On Sun, Oct 01 2017, Mikael Abrahamsson wrote:

> On Mon, 18 Sep 2017, NeilBrown wrote:
>
>> Anyway, thanks for the example of a real problem related to this.  It 
>> does make it easier to think about.
>
> Btw, if someone does --zero-superblock or dd /dev/zero to a component 
> device that is active, what happens when mdadm --stop /dev/mdX is run? 
> Does it write out the complete superblock again?

--zero-superblock won't work on a device that is currently part of an
array.  dd from /dev/zero will.
When the array is stopped, the metadata is written out only if the array
is not read-only and is not clean.
So for 'linear' and 'raid0' it is never written.  For the other levels it
probably is, but that is not guaranteed.
I'm not sure that forcing a write makes sense.  A dd could corrupt lots
of other data, and just saving the metadata is not a big win.

I've been playing with some code, and this patch makes it impossible to
write to a device which is in use by md.
Well... not exactly.  If a partition is in use by md, the whole device
can still be written to, but the partition itself cannot.
Also, if the metadata is managed by user-space, writes are still allowed.
To fix that, we would need to capture each write request and validate
the sector range.  Not impossible, but ugly.
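
To make the "validate the sector range" idea concrete, here is a rough
user-space sketch of the overlap test such a check would need.  It is
not part of the patch and not kernel code: struct protected_range and
both helper names are made up for illustration, and which ranges would
count as protected is left open.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: a run of sectors that only the holder may write to. */
struct protected_range {
	uint64_t start;	/* first protected sector */
	uint64_t nr;	/* number of protected sectors */
};

/* True if the write [start, start + nr) touches the protected range. */
static bool range_overlaps(const struct protected_range *p,
			   uint64_t start, uint64_t nr)
{
	if (nr == 0 || p->nr == 0)
		return false;
	return start < p->start + p->nr && p->start < start + nr;
}

/* True if a write misses every protected range and may go ahead. */
static bool write_range_allowed(const struct protected_range *ranges,
				size_t n, uint64_t start, uint64_t nr)
{
	for (size_t i = 0; i < n; i++)
		if (range_overlaps(&ranges[i], start, nr))
			return false;
	return true;
}
```

The ugly part is not this test but hooking it into every write path,
which the sketch does not attempt.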

Also, by itself, this patch breaks the use of raid6check on an active
array.  We could fix that by enabling writes whenever a region is
suspended.
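
As a rough illustration of how that exception could sit next to the
check the patch adds to blkdev_write_iter(), here is a user-space model.
struct bdev_model and write_permitted() are invented for this sketch,
and region_suspended is an assumed flag; the patch below has no such
field, and how md would actually signal a suspended region is not
worked out here.

```c
#include <stdbool.h>
#include <stddef.h>

/* User-space stand-in for the fields involved; not the kernel struct. */
struct bdev_model {
	const void *holder;		/* mirrors bd_holder */
	bool holder_only_writes;	/* mirrors bd_holder_only_writes */
	bool region_suspended;		/* assumption: set while a region is suspended */
};

/* True if 'writer' would be allowed to write to the device. */
static bool write_permitted(const struct bdev_model *b, const void *writer)
{
	/* No exclusive holder, or the holder did not ask for protection. */
	if (b->holder == NULL || !b->holder_only_writes)
		return true;
	/* The holder itself may always write. */
	if (b->holder == writer)
		return true;
	/* Assumed exception: others may write while a region is suspended. */
	return b->region_suspended;
}
```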

Still... maybe it is a starting point for thinking about the problem.

NeilBrown


diff --git a/drivers/md/md.c b/drivers/md/md.c
index 0ff1bbf6c90e..7c469cd9febc 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2264,6 +2264,7 @@ static int lock_rdev(struct md_rdev *rdev, dev_t dev, int shared)
 		pr_warn("md: could not open %s.\n", __bdevname(dev, b));
 		return PTR_ERR(bdev);
 	}
+	bdev->bd_holder_only_writes = !shared;
 	rdev->bdev = bdev;
 	return err;
 }
@@ -2272,6 +2273,7 @@ static void unlock_rdev(struct md_rdev *rdev)
 {
 	struct block_device *bdev = rdev->bdev;
 	rdev->bdev = NULL;
+	bdev->bd_holder_only_writes = 0;
 	blkdev_put(bdev, FMODE_READ|FMODE_WRITE|FMODE_EXCL);
 }
 
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 93d088ffc05c..673b71bac731 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1816,10 +1816,14 @@ void blkdev_put(struct block_device *bdev, fmode_t mode)
 		WARN_ON_ONCE(--bdev->bd_contains->bd_holders < 0);
 
 		/* bd_contains might point to self, check in a separate step */
-		if ((bdev_free = !bdev->bd_holders))
+		if ((bdev_free = !bdev->bd_holders)) {
+			bdev->bd_holder_only_writes = 0;
 			bdev->bd_holder = NULL;
-		if (!bdev->bd_contains->bd_holders)
+		}
+		if (!bdev->bd_contains->bd_holders) {
+			bdev->bd_contains->bd_holder_only_writes = 0;
 			bdev->bd_contains->bd_holder = NULL;
+		}
 
 		spin_unlock(&bdev_lock);
 
@@ -1884,8 +1888,13 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	loff_t size = i_size_read(bd_inode);
 	struct blk_plug plug;
 	ssize_t ret;
+	struct block_device *bdev = I_BDEV(bd_inode);
 
-	if (bdev_read_only(I_BDEV(bd_inode)))
+	if (bdev_read_only(bdev))
+		return -EPERM;
+	if (bdev->bd_holder != NULL &&
+	    bdev->bd_holder_only_writes &&
+	    bdev->bd_holder != file)
 		return -EPERM;
 
 	if (!iov_iter_count(from))
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 339e73742e73..79e3a2822867 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -424,6 +424,7 @@ struct block_device {
 	void *			bd_holder;
 	int			bd_holders;
 	bool			bd_write_holder;
+	bool			bd_holder_only_writes;
 #ifdef CONFIG_SYSFS
 	struct list_head	bd_holder_disks;
 #endif
