On Tue, Jul 4, 2023, at 8:56 AM, Jan Kara wrote:
> Writing to mounted devices is dangerous and can lead to filesystem
> corruption as well as crashes. Furthermore syzbot comes with more and
> more involved examples how to corrupt block device under a mounted
> filesystem leading to kernel crashes and reports we can do nothing
> about. Add tracking of writers to each block device and a kernel cmdline
> argument which controls whether writes to block devices open with
> BLK_OPEN_BLOCK_WRITES flag are allowed. We will make filesystems use
> this flag for used devices.
>
> Syzbot can use this cmdline argument option to avoid uninteresting
> crashes. Also users whose userspace setup does not need writing to
> mounted block devices can set this option for hardening.
>
> Link: https://lore.kernel.org/all/60788e5d-5c7c-1142-e554-c21d709acfd9@xxxxxxxxxx
> Signed-off-by: Jan Kara <jack@xxxxxxx>
> ---
>  block/Kconfig             | 16 ++++++++++
>  block/bdev.c              | 63 ++++++++++++++++++++++++++++++++++++++-
>  include/linux/blk_types.h |  1 +
>  include/linux/blkdev.h    |  3 ++
>  4 files changed, 82 insertions(+), 1 deletion(-)
>
> diff --git a/block/Kconfig b/block/Kconfig
> index 86122e459fe0..8b4fa105b854 100644
> --- a/block/Kconfig
> +++ b/block/Kconfig
> @@ -77,6 +77,22 @@ config BLK_DEV_INTEGRITY_T10
>  	select CRC_T10DIF
>  	select CRC64_ROCKSOFT
>
> +config BLK_DEV_WRITE_MOUNTED
> +	bool "Allow writing to mounted block devices"
> +	default y
> +	help
> +	  When a block device is mounted, writing to its buffer cache very likely

s/very/is very/

> +	  going to cause filesystem corruption. It is also rather easy to crash
> +	  the kernel in this way since the filesystem has no practical way of
> +	  detecting these writes to buffer cache and verifying its metadata
> +	  integrity. However there are some setups that need this capability
> +	  like running fsck on read-only mounted root device, modifying some
> +	  features on mounted ext4 filesystem, and similar. If you say N, the
> +	  kernel will prevent processes from writing to block devices that are
> +	  mounted by filesystems which provides some more protection from runaway
> +	  priviledged processes. If in doubt, say Y. The configuration can be

s/priviledged/privileged/

> +	  overridden with bdev_allow_write_mounted boot option.

s/with/with the/

> +/* open is exclusive wrt all other BLK_OPEN_WRITE opens to the device */
> +#define BLK_OPEN_BLOCK_WRITES	((__force blk_mode_t)(1 << 5))

Bikeshed, but I think BLK and BLOCK "stutter" here. The doc comment
already uses the term "exclusive", so how about BLK_OPEN_EXCLUSIVE?
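
To make the semantics under discussion concrete, here is a rough userspace
model of the writer tracking the commit message describes. It is only a
sketch under my own assumptions, not the code from block/bdev.c (that hunk is
not quoted above). Every name in it (struct bdev_model, model_open(),
model_release(), the MODEL_OPEN_* flags, model_allow_write_mounted) is
hypothetical and exists only for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for BLK_OPEN_WRITE and the flag under review. */
#define MODEL_OPEN_WRITE        (1u << 0)
#define MODEL_OPEN_BLOCK_WRITES (1u << 1)

/* Hypothetical per-device bookkeeping; the real patch may differ. */
struct bdev_model {
	int writers;        /* ordinary write opens currently held */
	int write_blockers; /* opens that asked to block other writers */
};

/* Models the boot option: true means writes to mounted devices are allowed. */
static bool model_allow_write_mounted = true;

/* Returns true if the open is granted and updates the counters. */
static bool model_open(struct bdev_model *b, unsigned int mode)
{
	bool blocks = mode & MODEL_OPEN_BLOCK_WRITES;
	bool writes = mode & MODEL_OPEN_WRITE;

	if (!model_allow_write_mounted) {
		/* A plain writer is refused while someone blocks writes. */
		if (writes && !blocks && b->write_blockers > 0)
			return false;
		/* A blocking open is refused while plain writers exist. */
		if (blocks && b->writers > 0)
			return false;
	}

	if (blocks)
		b->write_blockers++;
	else if (writes)
		b->writers++;
	return true;
}

/* Undoes the accounting done by a successful model_open(). */
static void model_release(struct bdev_model *b, unsigned int mode)
{
	if (mode & MODEL_OPEN_BLOCK_WRITES)
		b->write_blockers--;
	else if (mode & MODEL_OPEN_WRITE)
		b->writers--;
}
```

In this model, with model_allow_write_mounted set to false, a filesystem-style
open with MODEL_OPEN_BLOCK_WRITES fails while a plain writer holds the device,
and a plain write open fails once the blocking open has succeeded. That is the
"exclusive wrt all other write opens" behaviour the doc comment describes,
which is why a name built around "exclusive" reads naturally to me.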