Re: Regression in XFS for fsync heavy workload

On Wed, Mar 16, 2022 at 12:06:27PM +1100, Dave Chinner wrote:
> On Tue, Mar 15, 2022 at 01:49:43PM +0100, Jan Kara wrote:
> > Hello,
> > 
> > I was tracking down a regression in dbench workload on XFS we have
> > identified during our performance testing. These are results from one of
> > our test machine (server with 64GB of RAM, 48 CPUs, SATA SSD for the test
> > disk):
> > 
> > 			       good		       bad
> > Amean     1        64.29 (   0.00%)       73.11 * -13.70%*
> > Amean     2        84.71 (   0.00%)       98.05 * -15.75%*
> > Amean     4       146.97 (   0.00%)      148.29 *  -0.90%*
> > Amean     8       252.94 (   0.00%)      254.91 *  -0.78%*
> > Amean     16      454.79 (   0.00%)      456.70 *  -0.42%*
> > Amean     32      858.84 (   0.00%)      857.74 (   0.13%)
> > Amean     64     1828.72 (   0.00%)     1865.99 *  -2.04%*
> > 
> > Note that the numbers are actually times to complete workload, not
> > traditional dbench throughput numbers so lower is better.
....

> > This should still
> > submit it rather early to provide the latency advantage. Otherwise postpone
> > the flush to the moment we know we are going to flush the iclog to save
> > pointless flushes. But we would have to record whether the flush happened
> > or not in the iclog and it would all get a bit hairy...
> 
> I think we can just set the NEED_FLUSH flag appropriately.
> 
> However, given all this, I'm wondering if the async cache flush was
> really a case of premature optimisation. That is, we don't really
> gain anything by reducing the flush latency of the first iclog write
> when we are writing 100-1000 iclogs before the commit record, and it
> can be harmful to some workloads by issuing more flushes than we
> need to.
> 
> So perhaps the right thing to do is just get rid of it and always
> mark the first iclog in a checkpoint as NEED_FLUSH....

So I've run some tests on code that does this, and the storage I've
tested it on shows largely no difference in streaming CIL commit and
fsync heavy workloads when comparing sync vs async cache flushes. One
set of tests was against high speed NVMe SSDs, the other against
old, slower SATA SSDs.
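
To eyeball raw flush latency on a given device without dbench, a
trivial write+fsync loop does the job. This is just a stand-in
microbenchmark, not Jan's modified dbench test:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/*
 * Repeatedly overwrite one block and fsync it, then report the mean
 * latency of a write+fsync pair.  Each fsync forces a cache flush,
 * so this approximates a flush-heavy workload.
 */
int main(int argc, char **argv)
{
	const char *path = argc > 1 ? argv[1] : "fsync-test.dat";
	char buf[4096];
	struct timespec t0, t1;
	int i, iters = 1000;
	int fd;

	memset(buf, 'x', sizeof(buf));
	fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < iters; i++) {
		if (pwrite(fd, buf, sizeof(buf), 0) != sizeof(buf)) {
			perror("pwrite");
			return 1;
		}
		if (fsync(fd) < 0) {
			perror("fsync");
			return 1;
		}
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	printf("%d write+fsync pairs, mean %.1f us each\n", iters,
	       ((t1.tv_sec - t0.tv_sec) * 1e6 +
		(t1.tv_nsec - t0.tv_nsec) / 1e3) / iters);
	close(fd);
	unlink(path);
	return 0;
}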

Jan, can you run the patch below (against 5.17-rc8) and see what
results you get on your modified dbench test?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx


xfs: drop async cache flushes from CIL commits.

From: Dave Chinner <dchinner@xxxxxxxxxx>

As discussed here:

https://lore.kernel.org/linux-xfs/20220316010627.GO3927073@xxxxxxxxxxxxxxxxxxx/T/#t

This is a prototype for removing async cache flushes from the CIL
checkpoint path.

Fast NVMe storage:

