On Wed, Mar 16, 2022 at 08:38:40PM +0100, Jan Kara wrote:
> On Wed 16-03-22 11:09:34, Jan Kara wrote:
> > On Wed 16-03-22 18:44:59, Dave Chinner wrote:
> > > On Wed, Mar 16, 2022 at 12:06:27PM +1100, Dave Chinner wrote:
> > > > On Tue, Mar 15, 2022 at 01:49:43PM +0100, Jan Kara wrote:
> > > > > Hello,
> > > > >
> > > > > I was tracking down a regression in the dbench workload on XFS that we
> > > > > identified during our performance testing. These are results from one of
> > > > > our test machines (a server with 64GB of RAM, 48 CPUs, and a SATA SSD for
> > > > > the test disk):
> > > > >
> > > > >                        good                   bad
> > > > > Amean     1        64.29 (   0.00%)       73.11 * -13.70%*
> > > > > Amean     2        84.71 (   0.00%)       98.05 * -15.75%*
> > > > > Amean     4       146.97 (   0.00%)      148.29 *  -0.90%*
> > > > > Amean     8       252.94 (   0.00%)      254.91 *  -0.78%*
> > > > > Amean    16       454.79 (   0.00%)      456.70 *  -0.42%*
> > > > > Amean    32       858.84 (   0.00%)      857.74 (   0.13%)
> > > > > Amean    64      1828.72 (   0.00%)     1865.99 *  -2.04%*
> > > > >
> > > > > Note that the numbers are actually times to complete the workload, not
> > > > > traditional dbench throughput numbers, so lower is better.
> > > ....
> > > > > This should still submit it rather early to provide the latency
> > > > > advantage. Otherwise postpone the flush to the moment we know we are
> > > > > going to flush the iclog to save pointless flushes. But we would have
> > > > > to record whether the flush happened or not in the iclog and it would
> > > > > all get a bit hairy...
> > > >
> > > > I think we can just set the NEED_FLUSH flag appropriately.
> > > >
> > > > However, given all this, I'm wondering if the async cache flush was
> > > > really a case of premature optimisation. That is, we don't really
> > > > gain anything by reducing the flush latency of the first iclog write
> > > > when we are writing 100-1000 iclogs before the commit record, and it
> > > > can be harmful to some workloads by issuing more flushes than we
> > > > need to.
> > > >
> > > > So perhaps the right thing to do is just get rid of it and always
> > > > mark the first iclog in a checkpoint as NEED_FLUSH....
> > >
> > > So I've run some tests on code that does this, and the storage I've
> > > tested it on shows largely no difference in streaming CIL commit and
> > > fsync-heavy workloads when comparing sync vs async cache flushes. One
> > > set of tests was against high-speed NVMe SSDs, the other against old,
> > > slower SATA SSDs.
> > >
> > > Jan, can you run the patch below (against 5.17-rc8) and see what
> > > results you get on your modified dbench test?
> >
> > Sure, I'll run the test. I forgot to mention that in the vanilla upstream
> > kernel I could see the difference in the number of cache flushes caused
> > by the XFS changes, but not an actual change in the dbench numbers (they
> > were still comparable to the bad ones from my test). The XFS change made
> > a material difference to dbench performance only together with the
> > scheduler / cpuidle / frequency scaling fixes we have in our SLE kernel
> > (I didn't try to pin down which exactly - I guess I can try working
> > around that by using the performance cpufreq governor and disabling low
> > C-states so that I can test stock vanilla kernels). Thanks for the patch!
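To make the iclog flagging idea quoted above concrete, here is a minimal C
sketch of the approach Dave describes: instead of issuing an early
asynchronous cache flush when the CIL push starts, simply mark the first
iclog of the checkpoint so the flush travels with that iclog's write. All
identifiers below (sketch_iclog, SKETCH_NEED_FLUSH, and so on) are
hypothetical, not the actual XFS names; this is an illustration of the
technique, not the real patch.

```c
/*
 * Illustrative sketch only -- not the real XFS code. Models the idea of
 * "always mark the first iclog in a checkpoint as NEED_FLUSH" instead of
 * submitting a separate asynchronous cache flush up front.
 */
#include <stdbool.h>

#define SKETCH_NEED_FLUSH	(1 << 0)	/* flush device cache before this write */
#define SKETCH_NEED_FUA		(1 << 1)	/* write must be stable on completion */

struct sketch_iclog {
	unsigned int	flags;		/* per-iclog write behaviour flags */
	/* ... log record buffer, state, etc. ... */
};

struct sketch_cil_ctx {
	bool		started;	/* has the first iclog of this checkpoint gone out? */
};

/*
 * Called for every iclog written as part of a checkpoint. Only the first
 * iclog of the checkpoint needs to carry the cache flush; later iclogs in
 * the same checkpoint do not, which avoids issuing pointless flushes.
 */
void sketch_cil_write_iclog(struct sketch_cil_ctx *ctx,
			    struct sketch_iclog *iclog,
			    bool is_commit_record)
{
	if (!ctx->started) {
		/*
		 * No early async flush any more: flag the first iclog so the
		 * flush is issued together with the write itself.
		 */
		iclog->flags |= SKETCH_NEED_FLUSH;
		ctx->started = true;
	}

	if (is_commit_record) {
		/*
		 * In this sketch the commit record also flushes and is
		 * written for stable storage, so it cannot land ahead of the
		 * rest of the checkpoint.
		 */
		iclog->flags |= SKETCH_NEED_FLUSH | SKETCH_NEED_FUA;
	}

	/* hand the iclog to the (hypothetical) submission path here */
}
```

The point of the sketch is the trade-off discussed above: the first write of
a checkpoint pays the flush latency, but no extra flush commands are issued
for the hundreds of iclogs that may follow before the commit record.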
> 
> Yup, so with limiting C-states and the performance cpufreq governor I can
> see that your patch helps dbench performance significantly:
> 
>                 5.17-rc8-vanilla        5.17-rc8-patched
> Amean     1        71.22 (   0.00%)       64.94 *   8.81%*
> Amean     2        93.03 (   0.00%)       84.80 *   8.85%*
> Amean     4       150.54 (   0.00%)      137.51 *   8.66%*
> Amean     8       252.53 (   0.00%)      242.24 *   4.08%*
> Amean    16       454.13 (   0.00%)      439.08 *   3.31%*
> Amean    32       835.24 (   0.00%)      829.74 *   0.66%*
> Amean    64      1740.59 (   0.00%)     1686.73 *   3.09%*
> 
> The performance is restored to the values before commit bad77c375e8d ("xfs:
> CIL checkpoint flushes caches unconditionally"), as is the number of
> flushes.

OK, good to know, thanks for testing quickly. I'll spin this up into a
proper patch that removes the async flush functionality and its support
infrastructure.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
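For readers following the "removes the async flush functionality" plan, the
sketch below shows, again with hypothetical names rather than real kernel
identifiers, how the per-iclog flags from the earlier sketch would be turned
into request flags at write submission time. In the real kernel these would
correspond to REQ_PREFLUSH and REQ_FUA on the iclog's bio; local constants
are used here so the example stands alone and compiles as ordinary C.

```c
/*
 * Illustrative sketch only -- not the real XFS code. Shows why a
 * separately submitted (async) cache flush becomes unnecessary once the
 * flush requirement is carried by the iclog write itself.
 */
#include <stdio.h>

#define SKETCH_NEED_FLUSH	(1 << 0)
#define SKETCH_NEED_FUA		(1 << 1)

#define SKETCH_OP_PREFLUSH	(1 << 2)	/* ~ REQ_PREFLUSH: flush cache before the write */
#define SKETCH_OP_FUA		(1 << 3)	/* ~ REQ_FUA: write through to stable storage */

struct sketch_iclog {
	unsigned int flags;
};

/*
 * Translate the iclog's requirements into write operation flags. The flush
 * is issued as part of this write, so nothing has to be submitted (or
 * tracked for completion) ahead of time.
 */
unsigned int sketch_iclog_write_opflags(const struct sketch_iclog *iclog)
{
	unsigned int op = 0;

	if (iclog->flags & SKETCH_NEED_FLUSH)
		op |= SKETCH_OP_PREFLUSH;
	if (iclog->flags & SKETCH_NEED_FUA)
		op |= SKETCH_OP_FUA;
	return op;
}

int main(void)
{
	struct sketch_iclog first  = { .flags = SKETCH_NEED_FLUSH };
	struct sketch_iclog middle = { .flags = 0 };

	/* First iclog of the checkpoint carries the flush; later ones don't. */
	printf("first iclog opflags:  %#x\n", sketch_iclog_write_opflags(&first));
	printf("middle iclog opflags: %#x\n", sketch_iclog_write_opflags(&middle));
	return 0;
}
```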