On Sat, Mar 23, 2024 at 05:11:19PM +0100, Christian Brauner wrote:
> Last kernel release we introduced CONFIG_BLK_DEV_WRITE_MOUNTED. By
> default this option is set. When it is set the long-standing behavior
> of being able to write to mounted block devices is enabled.
>
> But in order to guard against unintended corruption by writing to the
> block device buffer cache CONFIG_BLK_DEV_WRITE_MOUNTED can be turned
> off. In that case it isn't possible to write to mounted block devices
> anymore.
>
> A filesystem may open its block devices with BLK_OPEN_RESTRICT_WRITES
> which disallows concurrent BLK_OPEN_WRITE access. When we still had the
> bdev handle around we could recognize BLK_OPEN_RESTRICT_WRITES because
> the mode was passed around. Since we managed to get rid of the bdev
> handle we changed that logic to recognize BLK_OPEN_RESTRICT_WRITES based
> on whether the file was opened writable and writes to that block device
> are blocked. That logic doesn't work because we do allow
> BLK_OPEN_RESTRICT_WRITES to be specified without BLK_OPEN_WRITE.
>
> So fix the detection logic. Use O_EXCL as an indicator that
> BLK_OPEN_RESTRICT_WRITES has been requested. We do the exact same thing
> for pidfds where O_EXCL means that this is a pidfd that refers to a
> thread. For userspace open paths O_EXCL will never be retained but for
> internal opens where we open files that are never installed into a file
> descriptor table this is fine.
>
> Note that BLK_OPEN_RESTRICT_WRITES is an internal only flag that cannot
> directly be raised by userspace. It is implicitly raised during
> mounting.
>
> Passes xfstests and blktests with CONFIG_BLK_DEV_WRITE_MOUNTED set and
> unset.
>
> Fixes: 321de651fa56 ("block: don't rely on BLK_OPEN_RESTRICT_WRITES when yielding write access")
> Reported-by: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> Link: https://lore.kernel.org/r/ZfyyEwu9Uq5Pgb94@xxxxxxxxxxxxxxxxxxxx
> Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx>

So v1 of this patch works fine.  I just got round to testing v2, and it
does not.  Indeed, applying 2/2 causes root to fail to mount:

/dev/root: Can't open blockdev
List of all bdev filesystems:
 ext3
 ext2
 ext4
 xfs

Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(254,0)

Applying only 1/2 boots but fails to fix the bug.
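
For reference, the detection problem described in the quoted commit
message can be modeled in plain userspace C. This is only a sketch under
stated assumptions: every name below (model_file, model_bdev, model_open,
detect_old, detect_new, the MODEL_OPEN_* bits) is hypothetical, and none
of it is kernel code or a kernel API. It only contrasts the old heuristic
("opened writable and writes are blocked") with a per-open marker playing
the role the description gives to O_EXCL.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two open-mode bits named in the description. */
enum {
    MODEL_OPEN_WRITE           = 1 << 0,
    MODEL_OPEN_RESTRICT_WRITES = 1 << 1,
};

/* Hypothetical per-open state: all that survives once the mode is gone. */
struct model_file {
    bool writable;     /* the open asked for write access */
    bool excl_marker;  /* plays the role of O_EXCL on an internal open */
};

/* Hypothetical per-device state. */
struct model_bdev {
    bool writes_blocked;
};

/* Record the open on the file and, if requested, block writes on the device. */
static struct model_file model_open(struct model_bdev *bdev, int mode)
{
    struct model_file f = {
        .writable    = (mode & MODEL_OPEN_WRITE) != 0,
        .excl_marker = (mode & MODEL_OPEN_RESTRICT_WRITES) != 0,
    };

    if (mode & MODEL_OPEN_RESTRICT_WRITES)
        bdev->writes_blocked = true;
    return f;
}

/* Old heuristic: infer the restriction from "writable and writes blocked". */
static bool detect_old(const struct model_file *f, const struct model_bdev *bdev)
{
    return f->writable && bdev->writes_blocked;
}

/* New approach: consult the marker stored on the open itself. */
static bool detect_new(const struct model_file *f)
{
    return f->excl_marker;
}
```

In this model an open with only MODEL_OPEN_RESTRICT_WRITES set is missed
by detect_old() but recognized by detect_new(), which is the case the
description says the old logic got wrong.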