On Fri, Feb 23, 2024 at 03:59:58PM -0800, Luis Chamberlain wrote:
> Part of the testing we have done with LBS was to do some performance
> tests on XFS to ensure things are not regressing. Building linux is a
> fine decent test and we did some random cloud instance tests on that and
> presented that at Plumbers, but it doesn't really cut it if we want to
> push things to the limit though. What are the limits to buffered IO
> and how do we test that? Who keeps track of it?

TLDR: Why does the pagecache suck?

> ~86 GiB/s on pmem DIO on xfs with 64k block size, 1024 XFS agcount on x86_64
> Vs
> ~ 7,000 MiB/s with buffered IO

Profile?  My guess is that you're bottlenecked on the xa_lock between
memory reclaim removing folios from the page cache and the various
threads adding folios to the page cache.  If each thread has its own
file, that would help.  If the threads do their own reclaim that would
help the page cache ... but then they'd contend on the node's lru lock
instead, so just trading one pain for another.
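
One rough way to check the "each thread has its own file" guess from
userspace is to run the same buffered write load twice, once with all
threads writing disjoint ranges of a single file and once with one file
per thread, and compare the aggregate throughput. The sketch below is
only an illustration under assumptions: it is not the benchmark used in
this thread, the function name, file names and parameters are
hypothetical, it assumes a Unix system where os.pwrite() is available,
and Python's own overhead will cap the numbers well below what the
hardware can do. It also only exercises reclaim if the total written
exceeds available memory, and it is no substitute for an actual profile.

```python
import os
import tempfile
import threading
import time


def buffered_write_throughput(directory, n_threads, bytes_per_thread,
                              shared_file, block_size=64 * 1024):
    """Return aggregate buffered-write throughput in MiB/s.

    shared_file=True:  every thread writes its own disjoint range of one file.
    shared_file=False: every thread writes to its own private file.
    (Hypothetical helper for illustration; names are not from the thread.)
    """
    block = b"\0" * block_size
    blocks_per_thread = bytes_per_thread // block_size
    shared_path = os.path.join(directory, "shared.dat")

    def worker(idx):
        if shared_file:
            path, base = shared_path, idx * bytes_per_thread
        else:
            path, base = os.path.join(directory, f"thread-{idx}.dat"), 0
        # No O_DIRECT here, so the writes go through the page cache.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
        try:
            for i in range(blocks_per_thread):
                written = os.pwrite(fd, block, base + i * block_size)
                if written != block_size:
                    raise OSError("short write")
        finally:
            os.close(fd)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start

    total_bytes = n_threads * blocks_per_thread * block_size
    return total_bytes / (1024 * 1024) / elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for shared in (True, False):
            rate = buffered_write_throughput(d, 4, 64 * 1024 * 1024, shared)
            print(f"shared_file={shared}: {rate:.0f} MiB/s")
```

If the per-file run scales noticeably better than the shared-file run,
that would be consistent with contention on the single file's mapping,
but it would not by itself identify xa_lock as the lock involved.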