On Mon, Mar 18, 2024 at 05:24:10PM -0700, Christoph Hellwig wrote:
> On Tue, Mar 19, 2024 at 09:45:51AM +1100, Dave Chinner wrote:
> > Apart from those small complexities that are resolved by the end of
> > the patchset, the conversion and enhancement is relatively straight
> > forward. It passes fstests on both 512 and 4096 byte sector size
> > storage (512 byte sectors exercise the XBF_KMEM path which has
> > non-zero bp->b_offset values) and doesn't appear to cause any
> > problems with large 64kB directory buffers on 4kB page machines.
>
> Just curious, do you have any benchmark numbers to see if this actually
> improves performance?

I have run some fsmark scalability tests on 64kB directory block sizes
to check that nothing fails and that the numbers are in the expected
ballpark, but I haven't done any specific back-to-back performance
regression testing. The reason for that is two-fold:

1. Scalability on 64kB directory buffer workloads is limited by buffer
   lock latency and journal size. i.e. even a 2GB journal is too small
   for high concurrency, so we see significant amounts of tail pushing,
   with directory modifications getting stuck waiting on writeback of
   the directory buffers being pushed out of the tail of the log.

2. Relogging 64kB directory blocks is -expensive-. Compared to a 4kB
   block size, large directory blocks are relogged much more
   frequently, and the memcpy() in each relog costs *much* more than
   relogging a 4kB directory block. It also hits xlog_kvmalloc() really
   hard, and that's now where we hit vmalloc scalability issues on
   large directory block size workloads.

The result of these things is that there hasn't been any significant
change in performance one way or the other - what we gain in buffer
access efficiency, we give back in increased lock contention and tail
pushing latency...

-Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
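
P.S. To put a rough number on the relog cost difference in point 2,
here's a minimal userspace sketch - not XFS code. The malloc()+memcpy()
loop is only a stand-in for the per-relog xlog_kvmalloc() and log
vector copy, and the block sizes and iteration count are arbitrary:

/*
 * Rough userspace approximation of per-relog copy cost for 4kB vs
 * 64kB directory blocks. The malloc/memcpy/free loop stands in for
 * allocating a log vector buffer and copying the block into it.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

static double relog_cost_ns(size_t blocksize, int iterations)
{
	char *dirblock = malloc(blocksize);
	struct timespec start, end;
	volatile char sink = 0;

	memset(dirblock, 0xa5, blocksize);

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (int i = 0; i < iterations; i++) {
		/* stand-in for the per-relog allocation + copy */
		char *logvec = malloc(blocksize);

		memcpy(logvec, dirblock, blocksize);
		sink += logvec[blocksize - 1]; /* don't optimise the copy away */
		free(logvec);
	}
	clock_gettime(CLOCK_MONOTONIC, &end);

	free(dirblock);
	return (end.tv_sec - start.tv_sec) * 1e9 +
	       (end.tv_nsec - start.tv_nsec);
}

int main(void)
{
	int iterations = 100000;
	double t4k = relog_cost_ns(4096, iterations);
	double t64k = relog_cost_ns(65536, iterations);

	printf("4kB  block: %.0f ns/relog\n", t4k / iterations);
	printf("64kB block: %.0f ns/relog\n", t64k / iterations);
	printf("cost ratio: %.1fx\n", t64k / t4k);
	return 0;
}

Whatever the exact numbers on a given machine, each relog of a 64kB
block moves 16x the data of a 4kB block, and the large block workloads
also relog each block more often, which is where the memcpy() and
xlog_kvmalloc() overhead piles up.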