On Thu, Mar 20, 2025 at 03:54:49PM +0100, Christoph Hellwig wrote:
> On Thu, Mar 20, 2025 at 02:47:22PM +0100, Daniel Gomez wrote:
> > On Thu, Mar 20, 2025 at 04:41:11AM +0100, Luis Chamberlain wrote:
> > > We've been constrained to a max single 512 KiB IO for a while now on x86_64.
> > > This is due to the number of DMA segments and the segment size. With LBS the
> > > segments can be much bigger without using huge pages, and so on a 64 KiB
> > > block size filesystem you can now see 2 MiB IOs when using buffered IO.
> >
> > Actually up to 8 MiB I/O with 64k filesystem block size with buffered I/O
> > as we can describe up to 128 segments at 64k size.
>
> Block layer segments are in no way limited to the logical block size.

You are right, but that was not what I meant. I'll use a 16 KiB fs example, since with 64 KiB you hit the current NVMe 8 MiB driver limit (NVME_MAX_KB_SZ): "on a 16 KiB block size filesystem, using buffered I/O will always allow at least 2 MiB I/O, though higher I/O may be possible".

And yes, we can do 8 MiB I/O with direct I/O as well. It's just not reliable unless huge pages are used. The maximum reliably supported I/O size there is 512 KiB. With buffered I/O, a larger fs block size guarantees a specific upper limit, i.e. 2 MiB for 16 KiB, 4 MiB for 32 KiB, and 8 MiB for 64 KiB.
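
The arithmetic behind those guarantees can be sketched as below. This is only an illustration of the numbers quoted in the thread (128 describable segments, one fs block per segment at minimum, and the 8 MiB NVME_MAX_KB_SZ cap), not how the kernel actually computes queue limits:

```python
KIB = 1024
MIB = 1024 * KIB

MAX_SEGMENTS = 128       # segments describable per I/O (figure from the thread)
NVME_MAX_IO = 8 * MIB    # current NVMe driver limit (NVME_MAX_KB_SZ), per the thread

def guaranteed_buffered_io(fs_block_size: int) -> int:
    """I/O size buffered I/O can always achieve for a given filesystem
    block size, assuming each segment covers at least one fs block,
    capped by the NVMe driver limit."""
    return min(MAX_SEGMENTS * fs_block_size, NVME_MAX_IO)

for bs_kib in (16, 32, 64):
    size = guaranteed_buffered_io(bs_kib * KIB)
    print(f"{bs_kib} KiB fs block -> {size // MIB} MiB guaranteed I/O")
```

This reproduces the progression stated above: 2 MiB for 16 KiB, 4 MiB for 32 KiB, and 8 MiB for 64 KiB blocks.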