On 2024/6/20 15:38, Huang, Ying wrote:
Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> writes:
On 2024/6/20 10:39, kernel test robot wrote:
Hello,
kernel test robot noticed a -7.1% regression of
vm-scalability.throughput on:
commit: d2136d749d76af980b3accd72704eea4eab625bd ("mm: support
multi-size THP numa balancing")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[still regression on linus/master
92e5605a199efbaee59fb19e15d6cc2103a04ec2]
testcase: vm-scalability
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
parameters:
runtime: 300s
size: 512G
test: anon-cow-rand-hugetlb
cpufreq_governor: performance
Thanks for reporting. IIUC numa balancing will not scan hugetlb VMA,
I'm not sure how this patch affects the performance of hugetlb cow,
but let me try to reproduce it.
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202406201010.a1344783-oliver.sang@xxxxxxxxx
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240620/202406201010.a1344783-oliver.sang@xxxxxxxxx
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase:
gcc-13/performance/x86_64-rhel-8.3/debian-12-x86_64-20240206.cgz/300s/512G/lkp-icl-2sp2/anon-cow-rand-hugetlb/vm-scalability
commit:
6b0ed7b3c7 ("mm: factor out the numa mapping rebuilding into a new helper")
d2136d749d ("mm: support multi-size THP numa balancing")
6b0ed7b3c77547d2 d2136d749d76af980b3accd7270
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
     12.02            -1.3       10.72 ±  4%  mpstat.cpu.all.sys%
   1228757            +3.0%    1265679        proc-vmstat.pgfault
Also from other proc-vmstat stats,
     21770 ± 36%      +6.1%      23098 ± 28%  proc-vmstat.numa_hint_faults
      6168 ±107%     +48.8%       9180 ± 18%  proc-vmstat.numa_hint_faults_local
    154537 ± 15%     +23.5%     190883 ± 17%  proc-vmstat.numa_pte_updates
After your patch, more hint page faults occur; I think this is expected.
Then, tasks may be moved between sockets because of that, so that some
hugetlb page accesses become remote?
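As a quick sanity check on the counters quoted above, the relative changes can be recomputed from the raw values. This is only a sketch: the numbers are copied from the report, and `pct_change` is a hypothetical helper name, not part of any lkp tooling.

```python
# Raw (before, after) values copied from the proc-vmstat rows in the report.
counters = {
    "proc-vmstat.pgfault": (1228757, 1265679),
    "proc-vmstat.numa_hint_faults": (21770, 23098),
    "proc-vmstat.numa_hint_faults_local": (6168, 9180),
    "proc-vmstat.numa_pte_updates": (154537, 190883),
}


def pct_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    for name, (before, after) in counters.items():
        print(f"{name}: {pct_change(before, after):+.1f}%")
```

Given the large %stddev on the hint-fault rows, the recomputed percentages confirm the arithmetic only; they say nothing about whether the differences are significant.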
After trying to reproduce this case, I also find that more hint page
faults occur. I think that is caused by changing
"folio_ref_count(folio) != 1" to "folio_likely_mapped_shared(folio)",
which results in scanning more exclusive pages, so I think this is
expected from the previous discussion.
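To illustrate why the new check can scan more exclusive pages, here is a toy model in Python, not kernel code. It assumes a folio is skipped when the predicate is true, and it approximates folio_likely_mapped_shared() as "mapped more than once", which is a simplification (the real helper is an estimate for large folios). The `Folio` class and both function names are hypothetical, and any other conditions in the real check are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Folio:
    ref_count: int  # all references, including transient or extra ones
    map_count: int  # number of page-table mappings


def skipped_by_old_check(folio: Folio) -> bool:
    # Models the old condition: folio_ref_count(folio) != 1
    return folio.ref_count != 1


def skipped_by_new_check(folio: Folio) -> bool:
    # Simplified stand-in for folio_likely_mapped_shared(folio)
    return folio.map_count > 1


folios = [
    Folio(ref_count=1, map_count=1),  # exclusive, no extra references
    Folio(ref_count=2, map_count=1),  # exclusive, but holds one extra reference
    Folio(ref_count=2, map_count=2),  # mapped by two processes
]

# Folios that the old check skipped but the new check lets through.
newly_scanned = [
    f for f in folios
    if skipped_by_old_check(f) and not skipped_by_new_check(f)
]

if __name__ == "__main__":
    print(newly_scanned)
```

In this model the only folio that changes category is the exclusively mapped one with an extra reference: the old check skipped it, while the new check scans it.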
Yes, I think your analysis is correct: some hugetlb page accesses become
remote due to task migration.