This series introduces a proposal for implementing atomic writes in the
kernel for torn-write protection.

This series takes the approach of adding a new "atomic" flag to each of
pwritev2() and iocb->ki_flags - RWF_ATOMIC and IOCB_ATOMIC, respectively.
When set, these indicate that we want the write issued "atomically".

Only direct IO is supported, and only for block devices here. For this,
atomic write HW is required, like SCSI ATOMIC WRITE (16).

I plan to send a series for supporting atomic writes for XFS later this
week, but initially only for XFS rtvol.

Updated man pages have been posted at:
https://lore.kernel.org/lkml/20240124112731.28579-1-john.g.garry@xxxxxxxxxx/T/#m520dca97a9748de352b5a723d3155a4bb1e46456

The goal here is to provide an interface that allows applications to use
application-specific block sizes larger than the logical block size
reported by the storage device or larger than the filesystem block size
as reported by stat().

With this new interface, application blocks will never be torn or
fractured when written. On a power failure, for each individual
application block, either all or none of the data is written. A racing
atomic write and read will mean that the read sees all the old data or
all the new data, but never a mix of old and new.

Three new fields are added to struct statx - atomic_write_unit_min,
atomic_write_unit_max, and atomic_write_segments_max. For each individual
atomic write, the total length of the write must be between
atomic_write_unit_min and atomic_write_unit_max, inclusive, and a
power-of-2. The write must also be at a natural offset in the file wrt
the write length, i.e. the offset must be a multiple of the write length.
For pwritev2, iovcnt is limited by atomic_write_segments_max.

SCSI sd.c and scsi_debug and NVMe kernel support is added.

This series is based on v6.8-rc1.

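
To make the length/offset/iovcnt rules above concrete, here is a minimal
userspace sketch of the checks an application might do before issuing an
atomic write. It is not code from this series: struct atomic_limits and
atomic_write_ok() are hypothetical names used for illustration only, the
limits are assumed to have been read from the three new statx fields, and
"len" is assumed to be the total length summed over all iovecs. The kernel
may apply further checks that are not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical container for the three limits described above
 * (atomic_write_unit_min, atomic_write_unit_max,
 * atomic_write_segments_max). Not a kernel or libc type.
 */
struct atomic_limits {
	uint32_t unit_min;	/* smallest allowed total write length, bytes */
	uint32_t unit_max;	/* largest allowed total write length, bytes */
	uint32_t segments_max;	/* largest allowed iovcnt */
};

static bool is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Return true if a write of total length @len at file offset @offset,
 * split over @iovcnt iovecs, satisfies the rules stated above.
 */
static bool atomic_write_ok(const struct atomic_limits *lim,
			    uint64_t offset, uint64_t len,
			    unsigned int iovcnt)
{
	/* Length must lie in [unit_min, unit_max], inclusive. */
	if (len < lim->unit_min || len > lim->unit_max)
		return false;

	/* Length must be a power-of-2. */
	if (!is_power_of_2(len))
		return false;

	/* Natural alignment: offset must be a multiple of the length. */
	if (offset & (len - 1))
		return false;

	/* iovcnt is limited by segments_max. */
	if (iovcnt == 0 || iovcnt > lim->segments_max)
		return false;

	return true;
}
```

For example, with limits of 4096/65536/4, a 16 KiB write at offset 16 KiB
passes these checks, while an 8 KiB write at offset 4 KiB fails because
the offset is not a multiple of the write length.
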
Changes since v2:
- Support atomic_write_segments_max
- Limit atomic write parameters to max_hw_sectors_kb
- Don't increase fmode_t
- Change value for RWF_ATOMIC
- Various tidying (including advised by Jan)

Changes since v1:
- Drop XFS support for now
- Tidy NVMe changes and also add checks for atomic write violating max
  AW PF length and boundary (if any)
- Reject - instead of ignoring - RWF_ATOMIC for files which do not
  support atomic writes
- Update block sysfs documentation
- Various tidy-ups

Alan Adamson (2):
  nvme: Support atomic writes
  nvme: Ensure atomic writes will be executed atomically

Himanshu Madhani (2):
  block: Add atomic write operations to request_queue limits
  block: Add REQ_ATOMIC flag

John Garry (9):
  block: Limit atomic writes according to bio and queue limits
  block: Pass blk_queue_get_max_sectors() a request pointer
  block: Limit atomic write IO size according to atomic_write_max_sectors
  block: Error an attempt to split an atomic write bio
  block: Add checks to merging of atomic writes
  block: Add fops atomic write support
  scsi: sd: Support reading atomic write properties from block limits VPD
  scsi: sd: Add WRITE_ATOMIC_16 support
  scsi: scsi_debug: Atomic write support

Prasad Singamsetty (2):
  fs/bdev: Add atomic write support info to statx
  fs: Add RWF_ATOMIC and IOCB_ATOMIC flags for atomic write support

 Documentation/ABI/stable/sysfs-block |  52 +++
 block/bdev.c                         |  37 +-
 block/blk-merge.c                    |  94 ++++-
 block/blk-mq.c                       |   2 +-
 block/blk-settings.c                 | 103 +++++
 block/blk-sysfs.c                    |  33 ++
 block/blk.h                          |   9 +-
 block/fops.c                         |  44 +-
 drivers/nvme/host/core.c             |  71 ++++
 drivers/nvme/host/nvme.h             |   2 +
 drivers/scsi/scsi_debug.c            | 589 +++++++++++++++++++++------
 drivers/scsi/scsi_trace.c            |  22 +
 drivers/scsi/sd.c                    |  93 ++++-
 drivers/scsi/sd.h                    |   8 +
 fs/stat.c                            |  47 ++-
 include/linux/blk_types.h            |   2 +
 include/linux/blkdev.h               |  45 +-
 include/linux/fs.h                   |  12 +
 include/linux/stat.h                 |   3 +
 include/scsi/scsi_proto.h            |   1 +
 include/trace/events/scsi.h          |   1 +
 include/uapi/linux/fs.h              |   5 +-
 include/uapi/linux/stat.h            |   9 +-
 23 files changed, 1123 insertions(+), 161 deletions(-)

-- 
2.31.1