Re: [linus:master] [migrate_pages] 7e12beb8ca: vm-scalability.throughput -3.4% regression

"Huang, Ying" <ying.huang@xxxxxxxxx> · Thu, 23 Mar 2023 09:53:23 +0800

"Liu, Yujie" <yujie.liu@xxxxxxxxx> writes:

> On Tue, 2023-03-21 at 13:43 +0800, Huang, Ying wrote:
>> "Liu, Yujie" <yujie.liu@xxxxxxxxx> writes:
>>
>> > Hi Ying,
>> >
>> > On Mon, 2023-03-20 at 15:58 +0800, Huang, Ying wrote:
>> > > Hi, Yujie,
>> > >
>> > > kernel test robot <yujie.liu@xxxxxxxxx> writes:
>> > >
>> > > > Hello,
>> > > >
>> > > > FYI, we noticed a -3.4% regression of vm-scalability.throughput due to commit:
>> > > >
>> > > > commit: 7e12beb8ca2ac98b2ec42e0ea4b76cdc93b58654 ("migrate_pages: batch flushing TLB")
>> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>> > > >
>> > > > in testcase: vm-scalability
>> > > > on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory
>> > > > with following parameters:
>> > > >
>> > > >         runtime: 300s
>> > > >         size: 512G
>> > > >         test: anon-cow-rand-mt
>> > > >         cpufreq_governor: performance
>> > > >
>> > > > test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
>> > > > test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
>> > > >
>> > > >
>> > > > If you fix the issue, kindly add following tag
>> > > > > Reported-by: kernel test robot <yujie.liu@xxxxxxxxx>
>> > > > > Link: https://lore.kernel.org/oe-lkp/202303192325.ecbaf968-yujie.liu@xxxxxxxxx
>> > > >
>> > >
>> > > Thanks a lot for the report!  Can you check whether the debug patch
>> > > below recovers the regression?
>> >
>> > We've tested the patch and found the throughput score was partially
>> > restored from -3.6% to -1.4%, still with a slight performance drop.
>> > Please check the detailed data as follows:
>>
>> Good!  Thanks for your detailed data!
>>
>> >       0.09 ± 17%      +1.2        1.32 ±  7%      +0.4        0.45 ± 21%  perf-profile.children.cycles-pp.flush_tlb_func
>>
>> It appears that we can reduce the unnecessary TLB flushing effectively
>> with the previous debug patch.  But the batched flush (full flush) is
>> still slower than the non-batched flush (flush one page).
>>
>> Can you try the debug patch as below to check whether it can restore the
>> regression completely?  The new debug patch can be applied on top of the
>> previous debug patch.
>
> The second debug patch got a -0.7% performance change. The data
> fluctuate from test to test, and the standard deviation is even a bit
> larger than 0.7%, so the throughput score alone is not very
> convincing. Please check the other metrics to see whether the
> regression is fully recovered. Thanks.

Thanks for testing!

>       0.09 ± 17%      +0.4        0.45 ± 21%      +0.0        0.09 ± 12%  perf-profile.children.cycles-pp.flush_tlb_func

From the profiling data, the TLB flushing overhead is back to the
baseline level.  So I think the remaining 0.7% regression is within the
noise.  I will prepare a fix based on the test results.
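
As a rough sanity check of the "noise level" reading, here is a minimal
sketch that compares the flush_tlb_func numbers quoted in this thread.
It assumes the "± N%" figures are relative standard deviations, and it
uses a crude interval-overlap heuristic rather than a proper statistical
test.  The helper name within_noise() is made up for illustration and is
not part of any lkp tooling.

```python
def within_noise(base_mean, base_rel_sd, new_mean, new_rel_sd):
    """Return True if the two (mean +/- stddev) intervals overlap.

    Assumption: rel_sd is a relative standard deviation, e.g. 0.17 for 17%.
    This is only a heuristic, not a significance test.
    """
    base_lo = base_mean * (1 - base_rel_sd)
    base_hi = base_mean * (1 + base_rel_sd)
    new_lo = new_mean * (1 - new_rel_sd)
    new_hi = new_mean * (1 + new_rel_sd)
    return new_lo <= base_hi and base_lo <= new_hi


# perf-profile.children.cycles-pp.flush_tlb_func values quoted in this thread
baseline = (0.09, 0.17)
kernels = {
    "7e12beb8ca": (1.32, 0.07),
    "debug patch 1": (0.45, 0.21),
    "debug patch 1 + 2": (0.09, 0.12),
}

for name, (mean, rel_sd) in kernels.items():
    verdict = "within noise" if within_noise(*baseline, mean, rel_sd) else "differs"
    print(f"{name}: {verdict}")
```

With these inputs only the kernel carrying both debug patches overlaps
the baseline interval, which matches the conclusion above.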

Best Regards,
Huang, Ying