Re: [PATCH 0/3] Create large folios in iomap buffered write path

Dave Chinner <david@xxxxxxxxxxxxx> · Sun, 21 May 2023 12:49:18 +1000

On Sat, May 20, 2023 at 05:36:00PM +0100, Matthew Wilcox (Oracle) wrote:
> Wang Yugui has a workload which would be improved by using large folios.

I think that's a bit of a misrepresentation of the situation. The
workload in question has regressed from ~8GB/s to 2.5GB/s due to
page cache structure contention caused by XFS limiting writeback bio
chain length to bound worst case IO completion latency in 5.15. This
causes several tasks attempting concurrent exclusive locking of the
mapping tree: write(), writeback IO submission, writeback IO
completion and multiple memory reclaim tasks (both direct and
background). Limiting worse case latency means that IO completion is
accessing the mapping tree much more frequently (every 16MB, instead
of 16-32GB), and that has driven this workload into lock contention
breakdown.

This was shown in profiles indicating the regression was caused
by page cache contention causing excessive CPU usage in the
writeback flusher thread limiting IO submission rates. This is not
something we can fix in XFS - it's a exclusive lock access
issue in the page cache...

Mitigating the performance regression by enabling large folios for
XFS doesn't actually fix any of the locking problems, it just
reduces lock traffic in the IO path by a couple of orders of
magnitude. The problem will come back as bandwidth increases again.
Also, the same problem will affect other filesystems that aren't
capable of using large folios.

Hence I think we really need to document this problem with the
mitigations being proposed so that, in future, we know how to
recognise when we hit these page cache limitations again. i.e.
I think it would also be a good idea to include some of the
analysis that pointed to the page cache contention problem here
(either in the cover letter so it can be used as a merge tag
message, or in a commit), rather than present it as "here's a
performance improvement" without any context of what problem it's
actually mitigating....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx