Kefeng Wang <wangkefeng.wang@xxxxxxxxxx> writes:

[snip]

>
>>> 1) Will test some rand test to check the different of performance as
>>>    David suggested.
>>>
>>> 2) Hope the LKP to run more tests since it is very useful(more test
>>>    set and different machines)
>> I'm starting to use LKP to test.
>
> Greet.

I have run some tests with LKP.

Firstly, there's almost no measurable difference between clearing pages
from start to end and from end to start on an Intel server CPU.  I guess
that there's some similar optimization for both directions.

For the multiple-process (as many processes as logical CPUs)
vm-scalability/anon-w-seq test case, the benchmark score increases by
about 22.4%.

For the multiple-process vm-scalability/anon-w-rand test case, there is
no measurable difference in the benchmark score.

So, the optimization mainly helps sequential workloads.

In summary, on x86, process_huge_page() will not introduce any
regression, and it helps some workloads.

However, on ARM64, it does introduce some regression when clearing pages
from end to start.  That needs to be addressed.  I guess that the
regression can be resolved by doing more of the clearing from start to
end (but not all of it).  For example, can you take a look at the patch
below?  It uses a framework similar to the one before, but clears each
small chunk (mpage) from start to end.  You can adjust MPAGE_NRPAGES to
check at which value the regression goes away.

WARNING: the patch is only build tested.

Best Regards,
Huang, Ying

-----------------------------------8<----------------------------------------