Re: [PATCH V4] block: make segment size limit workable for > 4K PAGE_SIZE

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Feb 19, 2025 at 10:44:09AM +0800, Ming Lei wrote:
> PAGE_SIZE is applied in validating block device queue limits, this way is
> very fragile and is wrong:
> 
> - queue limits are read from hardware, which is often one readonly hardware
> property
> 
> - PAGE_SIZE is one config option which can be changed during build time.
> 
> In RH lab, it has been found that max segment size of some mmc card is
> less than 64K, then this kind of card can't be probed successfully when
> same kernel is re-built with 64K PAGE_SIZE.
> 
> Fix this issue by adding BLK_MIN_SEGMENT_SIZE and lim->min_segment_size:
> 
> - validate segment limits by BLK_MIN_SEGMENT_SIZE which is 4K(minimized PAGE_SIZE)
> 
> - checking if one bvec can be one segment quickly by lim->min_segment_size
> 
> commit 6aeb4f836480 ("block: remove bio_add_pc_page")
> commit 02ee5d69e3ba ("block: remove blk_rq_bio_prep")
> commit b7175e24d6ac ("block: add a dma mapping iterator")

Let me try to help with this commit log message a bit, how about:

Using PAGE_SIZE as a minimum expected DMA segment size in consideration
of devices which have a max DMA segment size of 4k when used on 64k
PAGE_SIZE systems leads to devices not being able to probe such as
eMMC and Exynos UFS controller [0] [1] you can end up with a probe failure
as follows:

WARNING: CPU: 2 PID: 397 at block/blk-settings.c:339 blk_validate_limits+0x364/0x3c0                                                                                           
Modules linked in: mmc_block(+) rpmb_core crct10dif_ce ghash_ce sha2_ce dw_mmc_bluefield sha256_arm64 dw_mmc_pltfm sha1_ce dw_mmc mmc_core nfit i2c_mlxbf sbsa_gwdt gpio_mlxbf2
f_tmfifo dm_mirror dm_region_hash dm_log dm_mod
CPU: 2 UID: 0 PID: 397 Comm: (udev-worker) Not tainted 6.12.0-39.el10.aarch64+64k #1
Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS BlueField:3.5.1-1-g4078432 Jan 28 2021
ng pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)                                                                                                          
pc : blk_validate_limits+0x364/0x3c0                                                                                                                                           
p.service                                                                                                                                                                      
lr : blk_set_default_limits+0x20/0x40                                                                                                                                      
Setup...                                                                                                                                                                       
sp : ffff80008688f2d0                                                                                                                                                          
x29: ffff80008688f2d0 x28: ffff000082acb600 x27: ffff80007bef02a8                                                                                                              
x26: ffff80007bef0000 x25: ffff80008688f58e x24: ffff80008688f450                                                                                                              
x23: ffff80008301b000 x22: 00000000ffffffff x21: ffff800082c39950                                                                                                              
x20: 0000000000000000 x19: ffff0000930169e0 x18: 0000000000000014                                                                                                              
x17: 00000000767472b1 x16: 0000000005a697e6 x15: 0000000002f42ca4                                                                                                              
x11: 00000000de7f0111 x10: 000000005285b53a x9 : ffff800080752908                                                                                                              
x8 : 0000000000000001 x7 : 0000000000000000 x6 : 0000000000000200                                                                                                              
x5 : 0000000000000000 x4 : 000000000000ffff x3 : 0000000000004000                                                                                                              
x2 : 0000000000000200 x1 : 0000000000001000 x0 : ffff80008688f450                                                                                                              
Call trace:                                                                                                                                                                    
 blk_validate_limits+0x364/0x3c0                                                                                                                                               
 blk_set_default_limits+0x20/0x40                                                                                                                                              
 blk_alloc_queue+0x84/0x240                                                                                                                                                    
 blk_mq_alloc_queue+0x80/0x118                                                                                                                                                 
 __blk_mq_alloc_disk+0x28/0x198                                                                                                                                                
 mmc_alloc_disk+0xe0/0x260 [mmc_block]                                                                                                                                         
...                                                                                                                                                                                           
mmcblk mmc0:0001: probe with driver mmcblk failed with error -22  

To fix this provide a block sanity check to ensure we use the min of the
block device's segment size and PAGE_SIZE.

Link: https://lore.kernel.org/linux-block/20230612203314.17820-1-bvanassche@xxxxxxx/ # [0]
Link: https://lore.kernel.org/linux-block/1d55e942-5150-de4c-3a02-c3d066f87028@xxxxxxx/ # [1]

> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 15cd231d560c..4fe2dfabfc9d 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -329,7 +329,7 @@ int bio_split_rw_at(struct bio *bio, const struct queue_limits *lim,
>  
>  		if (nsegs < lim->max_segments &&
>  		    bytes + bv.bv_len <= max_bytes &&
> -		    bv.bv_offset + bv.bv_len <= PAGE_SIZE) {
> +		    bv.bv_offset + bv.bv_len <= lim->min_segment_size) {
>  			nsegs++;
>  			bytes += bv.bv_len;
>  		} else {

Now that's certainly more in line with what I expected.

> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 248416ecd01c..1f7d492975c1 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -367,6 +367,7 @@ struct queue_limits {
>  	unsigned int		max_sectors;
>  	unsigned int		max_user_sectors;
>  	unsigned int		max_segment_size;
> +	unsigned int		min_segment_size;
>  	unsigned int		physical_block_size;
>  	unsigned int		logical_block_size;
>  	unsigned int		alignment_offset;
> @@ -1163,6 +1164,8 @@ static inline bool bdev_is_partition(struct block_device *bdev)
>  enum blk_default_limits {
>  	BLK_MAX_SEGMENTS	= 128,
>  	BLK_SAFE_MAX_SECTORS	= 255,
> +	/* use minimized PAGE_SIZE as min segment size hint */
> +	BLK_MIN_SEGMENT_SIZE	= 4096,
>  	BLK_MAX_SEGMENT_SIZE	= 65536,
>  	BLK_SEG_BOUNDARY_MASK	= 0xFFFFFFFFUL,

I think Bart provided a more sensible comment:

Use 4 KiB as the smallest default supported DMA segment size limit instead of
PAGE_SIZE. This is important if the page size is larger than 4 KiB since
the maximum DMA segment size for some storage controllers is only 4 KiB.

With that:

Reviewed-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>

  Luis




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux