On 1/28/25 00:22, Christoph Hellwig wrote: > The loop driver currently uses the logical block size of the underlying > bdev as the lower bound of the loop device block size. While this works > for many cases, it fails for file systems made up of multiple devices > with different logic block size (e.g. XFS with a RT device that has a > larger logical block size), or when the file systems doesn't support > direct I/O writes at the sector size granularity (e.g. because it does > out of place writes with a file system block size larger than the sector > size). > > Fix this by querying the minimum direct I/O alignment from statx when > available. > > Signed-off-by: Christoph Hellwig <hch@xxxxxx> > --- > drivers/block/loop.c | 58 ++++++++++++++++++++++++-------------------- > 1 file changed, 32 insertions(+), 26 deletions(-) [...] > +static void loop_query_min_dio_size(struct loop_device *lo) > +{ > + struct file *file = lo->lo_backing_file; > + struct block_device *sb_bdev = file->f_mapping->host->i_sb->s_bdev; > + struct kstat st; > + > + if (!vfs_getattr(&file->f_path, &st, STATX_DIOALIGN, 0) && > + (st.result_mask & STATX_DIOALIGN)) > + lo->lo_min_dio_size = st.dio_offset_align; Nit: Maybe return here to avoid the else below and the comment block in the middle of a if-else-if ? > + /* > + * In a perfect world this wouldn't be needed, but as of Linux 6.13 only > + * a handful of file systems support the STATX_DIOALIGN flag. > + */ > + else if (sb_bdev) > + lo->lo_min_dio_size = bdev_logical_block_size(sb_bdev); > + else > + lo->lo_min_dio_size = SECTOR_SIZE; > +} [...] > -static unsigned int loop_default_blocksize(struct loop_device *lo, > - struct block_device *backing_bdev) > +static unsigned int loop_default_blocksize(struct loop_device *lo) > { > - /* In case of direct I/O, match underlying block size */ > - if ((lo->lo_backing_file->f_flags & O_DIRECT) && backing_bdev) > - return bdev_logical_block_size(backing_bdev); > + /* In case of direct I/O, match underlying minimum I/O size */ > + if (lo->lo_backing_file->f_flags & O_DIRECT) > + return lo->lo_min_dio_size; I wonder if conditionally returning lo_min_dio_size only for O_DIRECT is correct. What if the loop dev is first started without direct IO and gets a block size of 512B, and then later restarted with direct IO and now gets a block size of say 4K ? That could break a file system that is already on the device and formatted using the initial smaller block size. So shouldn't we simply always return lo_min_dio_size here, regardless of if O_DIRECT is used or not ? > return SECTOR_SIZE; > } > > @@ -993,7 +998,7 @@ static void loop_update_limits(struct loop_device *lo, struct queue_limits *lim, > backing_bdev = inode->i_sb->s_bdev; > > if (!bsize) > - bsize = loop_default_blocksize(lo, backing_bdev); > + bsize = loop_default_blocksize(lo); > > loop_get_discard_config(lo, &granularity, &max_discard_sectors); > > @@ -1090,6 +1095,7 @@ static int loop_configure(struct loop_device *lo, blk_mode_t mode, > lo->lo_backing_file = file; > lo->old_gfp_mask = mapping_gfp_mask(mapping); > mapping_set_gfp_mask(mapping, lo->old_gfp_mask & ~(__GFP_IO|__GFP_FS)); > + loop_query_min_dio_size(lo); > > lim = queue_limits_start_update(lo->lo_queue); > loop_update_limits(lo, &lim, config->block_size); -- Damien Le Moal Western Digital Research