On Thu, Feb 06, 2025 at 08:30:22PM +0100, Christoph Hellwig wrote:
> On Thu, Feb 06, 2025 at 07:00:55PM +0000, da.gomez@xxxxxxxxxx wrote:
> > From: Daniel Gomez <da.gomez@xxxxxxxxxxx>
> >
> > In patch [1] ("bdev: use bdev_io_min() for statx block size"), block
> > devices will now report their preferred minimum I/O size for optimal
> > performance in the stx_blksize field of the statx data structure. This
> > change updates the current default 4 KiB block size for all devices
> > reporting a minimum I/O larger than 4 KiB, opting instead to query for
> > its advertised minimum I/O value in the statx data struct.
>
> UUuh, no. Larger block sizes have their use cases, but this will
> regress performance for a lot (most?) common setups. A lot of
> device report fairly high values there, but say increasing the

Are these devices reporting the correct value? As I mentioned in my
discussion with Darrick, matching the minimum_io_size with the "fs
fundamental blocksize" actually lets us avoid RMW operations (when
using the default path in mkfs.xfs and the reported value is within
boundaries).

> directory and bmap btree block size unconditionally using the current
> algorithms will dramatically increase write amplification. Similarly

I agree, but it seems to be a consequence of using such a large
minimum_io_size.

> for small buffered writes.
>

Exactly. Write amplification already happens with a 512-byte
minimum_io_size and a 4k default block size, but there it doesn't incur
a performance penalty. We will incur one once minimum_io_size exceeds
4k. So, this solves the RMW issue but comes at the cost of write
amplification.
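
To put rough numbers on the trade-off discussed above, here is a small
back-of-the-envelope model in Python. It is only a sketch under
simplifying assumptions: writes start on a block boundary, every dirty
block is written back in full, and the device needs a read-modify-write
(RMW) whenever the filesystem block size is not a multiple of its
minimum_io_size. The function names are made up for illustration and do
not correspond to anything in the kernel or in mkfs.xfs.

```python
def bytes_written(write_size: int, block_size: int) -> int:
    """Bytes written back if every touched block is written in full.

    Assumes the write starts on a block boundary (a simplification).
    """
    if write_size <= 0 or block_size <= 0:
        raise ValueError("sizes must be positive")
    blocks = -(-write_size // block_size)  # ceiling division
    return blocks * block_size


def write_amplification(write_size: int, block_size: int) -> float:
    """Ratio of bytes written back to bytes the caller actually wrote."""
    return bytes_written(write_size, block_size) / write_size


def needs_device_rmw(fs_block_size: int, min_io_size: int) -> bool:
    """True if a single filesystem block cannot cover whole minimum-I/O units.

    Assumption of this model: the device must read-modify-write whenever
    the filesystem block size is not a multiple of its minimum I/O size.
    """
    return fs_block_size % min_io_size != 0


if __name__ == "__main__":
    small_write = 512  # bytes, e.g. a small buffered write
    for block_size in (4096, 16384, 65536):
        amp = write_amplification(small_write, block_size)
        print(f"block size {block_size:>6}: amplification {amp:.0f}x")
```

In this model a 512-byte write is amplified 8x with a 4 KiB block size
and 32x with a 16 KiB block size, while a 4 KiB block on a device with a
16 KiB minimum_io_size is flagged as needing RMW. Matching the block
size to minimum_io_size removes the RMW but raises the amplification,
which is the cost being debated here.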