[PATCH 1/2] block: handle BLK_OPEN_RESTRICT_WRITES correctly

In the last kernel release we introduced CONFIG_BLK_DEV_WRITE_MOUNTED.
By default this option is set. When it is set, the long-standing
behavior of being able to write to mounted block devices is enabled.

To guard against unintended corruption caused by writes to the block
device buffer cache, CONFIG_BLK_DEV_WRITE_MOUNTED can be turned off. In
that case it isn't possible to write to mounted block devices anymore.

A filesystem may open its block devices with BLK_OPEN_RESTRICT_WRITES,
which disallows concurrent BLK_OPEN_WRITE access. When we still had the
bdev handle around, we could recognize BLK_OPEN_RESTRICT_WRITES because
the mode was passed around. Since we got rid of the bdev handle, we
changed that logic to recognize BLK_OPEN_RESTRICT_WRITES based on
whether the file was opened writable and writes to that block device
are blocked. That logic doesn't work because we do allow
BLK_OPEN_RESTRICT_WRITES to be specified without BLK_OPEN_WRITE. In
that case the file isn't writable, so the yield path never unblocks
writes to the block device.
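
To illustrate the failure, here is a minimal userspace model of the
pre-fix yield logic. It is a sketch, not kernel code: the model_*
names, the MODEL_OPEN_* flags, and the use of writers == -1 to mean
"writes are blocked" are all assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for BLK_OPEN_WRITE and BLK_OPEN_RESTRICT_WRITES. */
#define MODEL_OPEN_WRITE           (1u << 0)
#define MODEL_OPEN_RESTRICT_WRITES (1u << 1)

/* Simplified device state: writers == -1 models "writes are blocked". */
struct model_bdev {
	int writers;
};

/* Simplified open file: only remembers whether it was opened writable. */
struct model_file {
	bool writable;
	struct model_bdev *bdev;
};

/* Claim exclusive (restricting) or shared write access at open time. */
static void model_claim(struct model_bdev *bdev, unsigned int mode)
{
	if (mode & MODEL_OPEN_RESTRICT_WRITES)
		bdev->writers = -1;
	else if (mode & MODEL_OPEN_WRITE)
		bdev->writers++;
}

/* Pre-fix yield logic: files that are not writable are ignored. */
static void model_yield_old(struct model_file *file)
{
	if (!file->writable)
		return;
	if (file->bdev->writers == -1)
		file->bdev->writers = 0;
	else
		file->bdev->writers--;
}

/* Open with @mode, close again, and report the leftover writer state. */
static int model_old_open_close(unsigned int mode)
{
	struct model_bdev bdev = { .writers = 0 };
	struct model_file file = {
		.writable = (mode & MODEL_OPEN_WRITE) != 0,
		.bdev = &bdev,
	};

	model_claim(&bdev, mode);
	model_yield_old(&file);
	return bdev.writers;
}
```

In this model, model_old_open_close(MODEL_OPEN_RESTRICT_WRITES) leaves
the device at -1, i.e. still blocked after the file is closed, whereas
passing MODEL_OPEN_WRITE as well brings it back to 0.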

So fix the detection logic. Use O_EXCL as an indicator that
BLK_OPEN_RESTRICT_WRITES has been requested. We do the exact same thing
for pidfds where O_EXCL means that this is a pidfd that refers to a
thread. For userspace open paths O_EXCL will never be retained but for
internal opens where we open files that are never installed into a file
descriptor table this is fine.
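
The detection scheme described above can be sketched in userspace as
follows. This is a simplified model under stated assumptions, not the
kernel implementation: the sketch_* names and SKETCH_OPEN_* flags are
hypothetical, and only O_EXCL, O_ACCMODE and the O_RDONLY/O_WRONLY/
O_RDWR access modes come from <fcntl.h>.

```c
#include <fcntl.h>

/* Hypothetical stand-ins for the BLK_OPEN_* mode bits. */
#define SKETCH_OPEN_READ            (1u << 0)
#define SKETCH_OPEN_WRITE           (1u << 1)
#define SKETCH_OPEN_RESTRICT_WRITES (1u << 2)

/* Simplified device state: writers == -1 models "writes are blocked". */
struct sketch_bdev {
	int writers;
};

/* Simplified open file: carries the translated open flags. */
struct sketch_file {
	int flags;
	struct sketch_bdev *bdev;
};

/* Translate the internal mode to open flags; restricting opens get O_EXCL. */
static int sketch_mode_to_flags(unsigned int mode)
{
	int flags;

	if ((mode & SKETCH_OPEN_READ) && (mode & SKETCH_OPEN_WRITE))
		flags = O_RDWR;
	else if (mode & SKETCH_OPEN_WRITE)
		flags = O_WRONLY;
	else
		flags = O_RDONLY;

	if (mode & SKETCH_OPEN_RESTRICT_WRITES)
		flags |= O_EXCL;
	return flags;
}

/* Claim exclusive (restricting) or shared write access at open time. */
static void sketch_claim(struct sketch_bdev *bdev, unsigned int mode)
{
	if (mode & SKETCH_OPEN_RESTRICT_WRITES)
		bdev->writers = -1;
	else if (mode & SKETCH_OPEN_WRITE)
		bdev->writers++;
}

/* Post-fix yield logic: O_EXCL alone decides whether to unblock writes. */
static void sketch_yield_new(struct sketch_file *file)
{
	int acc = file->flags & O_ACCMODE;

	if (file->flags & O_EXCL)
		file->bdev->writers = 0;
	else if (acc == O_WRONLY || acc == O_RDWR)
		file->bdev->writers--;
}

/* Open with @mode, close again, and report the leftover writer state. */
static int sketch_new_open_close(unsigned int mode)
{
	struct sketch_bdev bdev = { .writers = 0 };
	struct sketch_file file = {
		.flags = sketch_mode_to_flags(mode),
		.bdev = &bdev,
	};

	sketch_claim(&bdev, mode);
	sketch_yield_new(&file);
	return bdev.writers;
}
```

With this model, a read-only restricting open
(SKETCH_OPEN_READ | SKETCH_OPEN_RESTRICT_WRITES) returns the device to
0 on close, because the yield path keys off O_EXCL rather than off
whether the file was opened writable.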

Note that BLK_OPEN_RESTRICT_WRITES is an internal only flag that cannot
directly be raised by userspace. It is implicitly raised during
mounting.

Passes xfstests and blktests with CONFIG_BLK_DEV_WRITE_MOUNTED set and
unset.

Fixes: 321de651fa56 ("block: don't rely on BLK_OPEN_RESTRICT_WRITES when yielding write access")
Reported-by: Matthew Wilcox <willy@xxxxxxxxxxxxx>
Link: https://lore.kernel.org/r/ZfyyEwu9Uq5Pgb94@xxxxxxxxxxxxxxxxxxxx
Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx>
---
 block/bdev.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/block/bdev.c b/block/bdev.c
index 7a5f611c3d2e..f819f3086905 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -821,13 +821,12 @@ static void bdev_yield_write_access(struct file *bdev_file)
 		return;
 
 	bdev = file_bdev(bdev_file);
-	/* Yield exclusive or shared write access. */
-	if (bdev_file->f_mode & FMODE_WRITE) {
-		if (bdev_writes_blocked(bdev))
-			bdev_unblock_writes(bdev);
-		else
-			bdev->bd_writers--;
-	}
+
+	/* O_EXCL is only set for internal BLK_OPEN_RESTRICT_WRITES. */
+	if (bdev_file->f_flags & O_EXCL)
+		bdev_unblock_writes(bdev);
+	else if (bdev_file->f_mode & FMODE_WRITE)
+		bdev->bd_writers--;
 }
 
 /**
@@ -946,6 +945,13 @@ static unsigned blk_to_file_flags(blk_mode_t mode)
 	else
 		WARN_ON_ONCE(true);
 
+	/*
+	 * BLK_OPEN_RESTRICT_WRITES is never set from userspace and
+	 * O_EXCL is stripped from userspace.
+	 */
+	if (mode & BLK_OPEN_RESTRICT_WRITES)
+		flags |= O_EXCL;
+
 	if (mode & BLK_OPEN_NDELAY)
 		flags |= O_NDELAY;
 
-- 
2.43.0




