Hello,

kernel test robot noticed a -3.7% regression of will-it-scale.per_process_ops on:

commit: 5886fc82b6e3166dd1ba876809888fc39028d626 ("mm/slub: attempt to find layouts up to 1/2 waste in calculate_order()")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

testcase: will-it-scale
test machine: 224 threads 4 sockets Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz (Cooper Lake) with 192G memory
parameters:

	nr_task: 50%
	mode: process
	test: poll2
	cpufreq_governor: performance

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202310202221.fdbcbe56-oliver.sang@xxxxxxxxx

Details are as below:
-------------------------------------------------------------------------------------------------->

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231020/202310202221.fdbcbe56-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/process/50%/debian-11.1-x86_64-20220510.cgz/lkp-cpl-4sp2/poll2/will-it-scale

commit:
  0fe2735d5e ("mm/slub: remove min_objects loop from calculate_order()")
  5886fc82b6 ("mm/slub: attempt to find layouts up to 1/2 waste in calculate_order()")

0fe2735d5e2e0060 5886fc82b6e3166dd1ba8768098
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
     28.08          +1.1%      28.40        boot-time.dhcp
      6.17 ± 10%   -15.4%       5.22 ± 10%  perf-sched.wait_and_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      6.17 ± 10%   -15.4%       5.22 ± 10%  perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
  98376568          -3.7%   94713387        will-it-scale.112.processes
    878361          -3.7%     845654        will-it-scale.per_process_ops
  98376568          -3.7%   94713387        will-it-scale.workload
     81444          +4.8%      85370        proc-vmstat.nr_active_anon
     85071          +4.8%      89137        proc-vmstat.nr_shmem
     81444          +4.8%      85370        proc-vmstat.nr_zone_active_anon
     79205          +3.8%      82205        proc-vmstat.pgactivate
      5.18           -0.4       4.79 ± 2%   perf-profile.calltrace.cycles-pp.__fdget.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64
      2.18           -0.2       2.03 ± 2%   perf-profile.calltrace.cycles-pp.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.29           -0.1       2.19        perf-profile.calltrace.cycles-pp.__entry_text_start.__poll
      0.83           -0.1       0.76 ± 3%   perf-profile.calltrace.cycles-pp.__check_heap_object.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64
      0.90           -0.1       0.84 ± 2%   perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64
      0.66 ± 2%      -0.1       0.61 ± 2%   perf-profile.calltrace.cycles-pp.__virt_addr_valid.check_heap_object.__check_object_size.do_sys_poll.__x64_sys_poll
      0.66           -0.0       0.61 ± 3%   perf-profile.calltrace.cycles-pp.kfree.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
     47.75           +1.3      49.07        perf-profile.calltrace.cycles-pp.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
     22.63           +2.1      24.74        perf-profile.calltrace.cycles-pp.__fget_light.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64
      5.17           -0.4       4.78 ± 2%   perf-profile.children.cycles-pp.__fdget
      2.35           -0.2       2.18 ± 2%   perf-profile.children.cycles-pp.__check_object_size
      0.84           -0.1       0.77 ± 3%   perf-profile.children.cycles-pp.__check_heap_object
      1.48           -0.1       1.41        perf-profile.children.cycles-pp.__entry_text_start
      0.94           -0.1       0.87 ± 2%   perf-profile.children.cycles-pp.check_heap_object
      1.57           -0.1       1.51 ± 2%   perf-profile.children.cycles-pp.__kmalloc
      0.68 ± 2%      -0.1       0.63        perf-profile.children.cycles-pp.__virt_addr_valid
      0.66           -0.0       0.61 ± 3%   perf-profile.children.cycles-pp.kfree
      0.83           -0.0       0.79 ± 2%   perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
     22.29           +1.7      24.01        perf-profile.children.cycles-pp.__fget_light
     48.12           +1.7      49.84        perf-profile.children.cycles-pp.do_poll
      7.66           -0.4       7.22        perf-profile.self.cycles-pp.do_sys_poll
      2.58 ± 2%      -0.2       2.38 ± 2%   perf-profile.self.cycles-pp.__fdget
      2.23           -0.1       2.12 ± 2%   perf-profile.self.cycles-pp._copy_from_user
      1.07 ± 3%      -0.1       0.98 ± 2%   perf-profile.self.cycles-pp.__poll
      0.84           -0.1       0.77 ± 2%   perf-profile.self.cycles-pp.__check_heap_object
      0.66 ± 2%      -0.1       0.61 ± 2%   perf-profile.self.cycles-pp.__virt_addr_valid
      0.65           -0.0       0.61 ± 3%   perf-profile.self.cycles-pp.kfree
      0.80           -0.0       0.76 ± 2%   perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.67 ± 2%      -0.0       0.64        perf-profile.self.cycles-pp.__entry_text_start
     19.62           +1.9      21.54        perf-profile.self.cycles-pp.__fget_light
 2.225e+11          -3.7%  2.143e+11        perf-stat.i.branch-instructions
 5.573e+08          -3.2%  5.393e+08        perf-stat.i.branch-misses
   2332742 ± 2%     -6.6%    2179079        perf-stat.i.cache-misses
  13799351          -3.9%   13256775        perf-stat.i.cache-references
      0.32          +5.0%       0.34        perf-stat.i.cpi
 3.863e+11          +1.2%  3.908e+11        perf-stat.i.cpu-cycles
    174616 ± 3%     +9.1%     190529 ± 2%   perf-stat.i.cycles-between-cache-misses
 2.777e+11          -3.7%  2.675e+11        perf-stat.i.dTLB-loads
 1.689e+11          -3.7%  1.627e+11        perf-stat.i.dTLB-stores
  50719249          -2.8%   49295350        perf-stat.i.iTLB-load-misses
   2674672         -14.5%    2285560        perf-stat.i.iTLB-loads
 1.206e+12          -3.7%  1.161e+12        perf-stat.i.instructions
      3.12          -4.8%       2.97        perf-stat.i.ipc
      1.24          -4.0%       1.19        perf-stat.i.metric.G/sec
      1.72          +1.1%       1.74        perf-stat.i.metric.GHz
     76.66          -5.6%      72.34        perf-stat.i.metric.K/sec
      1743          -3.5%       1683        perf-stat.i.metric.M/sec
    594324          -2.9%     576831        perf-stat.i.node-load-misses
      0.32          +5.0%       0.34        perf-stat.overall.cpi
    165074 ± 2%     +8.2%     178683        perf-stat.overall.cycles-between-cache-misses
      3.12          -4.8%       2.97        perf-stat.overall.ipc
 2.217e+11          -3.7%  2.135e+11        perf-stat.ps.branch-instructions
 5.554e+08          -3.2%  5.375e+08        perf-stat.ps.branch-misses
   2333651 ± 2%     -6.6%    2179985        perf-stat.ps.cache-misses
  13948192          -3.9%   13410551        perf-stat.ps.cache-references
 3.849e+11          +1.2%  3.894e+11        perf-stat.ps.cpu-cycles
 2.767e+11          -3.7%  2.665e+11        perf-stat.ps.dTLB-loads
 1.683e+11          -3.7%  1.621e+11        perf-stat.ps.dTLB-stores
  50558427          -2.8%   49131845        perf-stat.ps.iTLB-load-misses
   2664632         -14.5%    2276961 ± 2%   perf-stat.ps.iTLB-loads
 1.201e+12          -3.7%  1.157e+12        perf-stat.ps.instructions
    592459          -2.9%     575320        perf-stat.ps.node-load-misses
 3.621e+14          -3.6%  3.492e+14        perf-stat.total.instructions

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki