On Thu, Feb 06, 2025 at 02:27:16PM +0100, Darrick J. Wong wrote:
> On Thu, Feb 06, 2025 at 07:00:55PM +0000, da.gomez@xxxxxxxxxx wrote:
> > From: Daniel Gomez <da.gomez@xxxxxxxxxxx>
> >
> > In patch [1] ("bdev: use bdev_io_min() for statx block size"), block
> > devices will now report their preferred minimum I/O size for optimal
> > performance in the stx_blksize field of the statx data structure. This
> > change updates the current default 4 KiB block size for all devices
> > reporting a minimum I/O larger than 4 KiB, opting instead to query for
> > its advertised minimum I/O value in the statx data struct.
> >
> > [1]:
> > https://lore.kernel.org/all/20250204231209.429356-9-mcgrof@xxxxxxxxxx/
>
> This isn't even upstream yet...
>
> > Signed-off-by: Daniel Gomez <da.gomez@xxxxxxxxxxx>
> > ---
> > Set MIN-IO from statx as the default filesystem fundamental block size.
> > This ensures that, for devices reporting values within the supported
> > XFS block size range, we do not incur in RMW. If the MIN-IO reported
> > value is lower than the current default of 4 KiB, then 4 KiB will be
> > used instead.
>
> I don't think this is a good idea -- assuming you mean the same MIN-IO
> as what lsblk puts out:

This is just about matching the values in code and documentation across
all layers (so that writes do not incur RMW when possible and supported
by the fs):

minimum_io_size (block layer) -> stx_blksize (statx) -> lsblk MIN-IO
(minimum I/O size) -> filesystem fundamental block size (mkfs.xfs -b size).

* MIN-IO is the minimum I/O size in lsblk [1], which should be the
  queue-sysfs minimum_io_size [2] [3] ("This is the smallest preferred IO
  size reported by the device").
* From the statx [4] manual (and the kernel statx data struct description),
  stx_blksize is 'The "preferred" block size for efficient filesystem I/O
  (Writing to a file in smaller chunks may cause an inefficient
  read-modify-rewrite.)'

[1] https://github.com/util-linux/util-linux/blob/master/misc-utils/lsblk.c#L199
[2] minimum_io_size:
https://www.kernel.org/doc/Documentation/block/queue-sysfs.txt
[3] https://www.kernel.org/doc/Documentation/ABI/stable/sysfs-block

What:		/sys/block/<disk>/queue/minimum_io_size
Date:		April 2009
Contact:	Martin K. Petersen <martin.petersen@xxxxxxxxxx>
Description:
		[RO] Storage devices may report a granularity or preferred
		minimum I/O size which is the smallest request the device can
		perform without incurring a performance penalty. For disk
		drives this is often the physical block size. For RAID arrays
		it is often the stripe chunk size. A properly aligned multiple
		of minimum_io_size is the preferred request size for workloads
		where a high number of I/O operations is desired.

[4] https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/tree/man/man2/statx.2?id=master#n369

kernel:
	__u32	stx_blksize;	/* Preferred general I/O size [uncond] */

> nvme1n1                       512
> └─md0                      524288
>   └─node0.raid             524288
>     └─node0_raid-storage   524288

Is the MIN-IO correctly reported for RAID arrays here? I guess it should
match the stripe chunk size, as per the description above?
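
To make the proposed default selection concrete, here is a minimal sketch
of the rule described in the patch notes. It is not the actual mkfs.xfs
code: the helper name and the macros are hypothetical. The 4 KiB floor
comes from the patch description. The 64 KiB upper bound and the fallback
to 4 KiB for values outside the supported range (or not a power of two)
are assumptions on my side, since the description does not say what
happens above the supported XFS block size range.

```c
#include <stdint.h>

/* Current default from the patch description. */
#define SKETCH_DEFAULT_BLOCKSIZE	4096u
/* Assumption: upper end of the supported XFS block size range (64 KiB). */
#define SKETCH_MAX_BLOCKSIZE		65536u

/*
 * Hypothetical helper: map a reported MIN-IO value (e.g. stx_blksize)
 * to a default filesystem block size.
 */
static uint32_t sketch_default_blocksize(uint32_t min_io)
{
	/* MIN-IO below the current default: keep 4 KiB. */
	if (min_io < SKETCH_DEFAULT_BLOCKSIZE)
		return SKETCH_DEFAULT_BLOCKSIZE;

	/*
	 * Assumption: values that are not a power of two, or that exceed
	 * the supported range, fall back to the 4 KiB default.
	 */
	if ((min_io & (min_io - 1)) != 0 || min_io > SKETCH_MAX_BLOCKSIZE)
		return SKETCH_DEFAULT_BLOCKSIZE;

	/* Within range: use the advertised MIN-IO as the block size. */
	return min_io;
}
```

With the lsblk values quoted above, and under these assumptions, nvme1n1
(512) would stay at 4096, and md0 (524288) would also fall back to 4096
because it is outside the assumed range. A device reporting 16384 would
get a 16 KiB block size.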