Re: [PATCH] loop: take the file system minimum dio alignment into account

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 1/28/25 00:22, Christoph Hellwig wrote:
> The loop driver currently uses the logical block size of the underlying
> bdev as the lower bound of the loop device block size.  While this works
> for many cases, it fails for file systems made up of multiple devices
> with different logic block size (e.g. XFS with a RT device that has a
> larger logical block size), or when the file systems doesn't support
> direct I/O writes at the sector size granularity (e.g. because it does
> out of place writes with a file system block size larger than the sector
> size).
> 
> Fix this by querying the minimum direct I/O alignment from statx when
> available.
> 
> Signed-off-by: Christoph Hellwig <hch@xxxxxx>
> ---
>  drivers/block/loop.c | 58 ++++++++++++++++++++++++--------------------
>  1 file changed, 32 insertions(+), 26 deletions(-)

[...]

> +static void loop_query_min_dio_size(struct loop_device *lo)
> +{
> +	struct file *file = lo->lo_backing_file;
> +	struct block_device *sb_bdev = file->f_mapping->host->i_sb->s_bdev;
> +	struct kstat st;
> +
> +	if (!vfs_getattr(&file->f_path, &st, STATX_DIOALIGN, 0) &&
> +	    (st.result_mask & STATX_DIOALIGN))
> +		lo->lo_min_dio_size = st.dio_offset_align;

Nit: Maybe return here to avoid the else below and the comment block in the
middle of a if-else-if ?

> +	/*
> +	 * In a perfect world this wouldn't be needed, but as of Linux 6.13 only
> +	 * a handful of file systems support the STATX_DIOALIGN flag.
> +	 */
> +	else if (sb_bdev)
> +		lo->lo_min_dio_size = bdev_logical_block_size(sb_bdev);
> +	else
> +		lo->lo_min_dio_size = SECTOR_SIZE;
> +}

[...]

> -static unsigned int loop_default_blocksize(struct loop_device *lo,
> -		struct block_device *backing_bdev)
> +static unsigned int loop_default_blocksize(struct loop_device *lo)
>  {
> -	/* In case of direct I/O, match underlying block size */
> -	if ((lo->lo_backing_file->f_flags & O_DIRECT) && backing_bdev)
> -		return bdev_logical_block_size(backing_bdev);
> +	/* In case of direct I/O, match underlying minimum I/O size */
> +	if (lo->lo_backing_file->f_flags & O_DIRECT)
> +		return lo->lo_min_dio_size;

I wonder if conditionally returning lo_min_dio_size only for O_DIRECT is
correct. What if the loop dev is first started without direct IO and gets a
block size of 512B, and then later restarted with direct IO and now gets a block
size of say 4K ? That could break a file system that is already on the device
and formatted using the initial smaller block size. So shouldn't we simply
always return lo_min_dio_size here, regardless of if O_DIRECT is used or not ?

>  	return SECTOR_SIZE;
>  }
>  
> @@ -993,7 +998,7 @@ static void loop_update_limits(struct loop_device *lo, struct queue_limits *lim,
>  		backing_bdev = inode->i_sb->s_bdev;
>  
>  	if (!bsize)
> -		bsize = loop_default_blocksize(lo, backing_bdev);
> +		bsize = loop_default_blocksize(lo);
>  
>  	loop_get_discard_config(lo, &granularity, &max_discard_sectors);
>  
> @@ -1090,6 +1095,7 @@ static int loop_configure(struct loop_device *lo, blk_mode_t mode,
>  	lo->lo_backing_file = file;
>  	lo->old_gfp_mask = mapping_gfp_mask(mapping);
>  	mapping_set_gfp_mask(mapping, lo->old_gfp_mask & ~(__GFP_IO|__GFP_FS));
> +	loop_query_min_dio_size(lo);
>  
>  	lim = queue_limits_start_update(lo->lo_queue);
>  	loop_update_limits(lo, &lim, config->block_size);


-- 
Damien Le Moal
Western Digital Research




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux