Hello,

kernel test robot noticed a -2.9% regression of will-it-scale.per_thread_ops on:


commit: 0ede61d8589cc2d93aa78230d74ac58b5b8d0244 ("file: convert to SLAB_TYPESAFE_BY_RCU")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: will-it-scale
test machine: 224 threads 4 sockets Intel(R) Xeon(R) Platinum 8380H CPU @ 2.90GHz (Cooper Lake) with 192G memory
parameters:

	nr_task: 16
	mode: thread
	test: poll2
	cpufreq_governor: performance


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202311201406.2022ca3f-oliver.sang@xxxxxxxxx


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231120/202311201406.2022ca3f-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/thread/16/debian-11.1-x86_64-20220510.cgz/lkp-cpl-4sp2/poll2/will-it-scale

commit:
  93faf426e3 ("vfs: shave work on failed file open")
  0ede61d858 ("file: convert to SLAB_TYPESAFE_BY_RCU")

93faf426e3cc000c 0ede61d8589cc2d93aa78230d74
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
      0.01 ±  9%  +58125.6%       4.17 ±175%  perf-sched.sch_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
     89056            -2.0%      87309        proc-vmstat.nr_slab_unreclaimable
     97958 ±  7%      -9.7%      88449 ±  4%  sched_debug.cpu.avg_idle.stddev
      0.00 ± 12%     +24.2%       0.00 ± 17%  sched_debug.cpu.next_balance.stddev
   6391048            -2.9%    6208584        will-it-scale.16.threads
    399440            -2.9%     388036        will-it-scale.per_thread_ops
   6391048            -2.9%    6208584        will-it-scale.workload
     19.99 ±  4%      -2.2       17.74        perf-profile.calltrace.cycles-pp.fput.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64
      1.27 ±  5%      +0.8        2.11 ±  3%  perf-profile.calltrace.cycles-pp.__fdget.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64
     32.69 ±  4%      +5.0       37.70        perf-profile.calltrace.cycles-pp.__fget_light.do_poll.do_sys_poll.__x64_sys_poll.do_syscall_64
      0.00           +27.9       27.85        perf-profile.calltrace.cycles-pp.__get_file_rcu.__fget_light.do_poll.do_sys_poll.__x64_sys_poll
     20.00 ±  4%      -2.3       17.75        perf-profile.children.cycles-pp.fput
      0.24 ± 10%      -0.1        0.18 ±  2%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      1.48 ±  5%      +0.5        1.98 ±  3%  perf-profile.children.cycles-pp.__fdget
     31.85 ±  4%      +6.0       37.86        perf-profile.children.cycles-pp.__fget_light
      0.00           +27.7       27.67        perf-profile.children.cycles-pp.__get_file_rcu
     30.90 ±  4%     -20.6       10.35 ±  2%  perf-profile.self.cycles-pp.__fget_light
     19.94 ±  4%      -2.4       17.53        perf-profile.self.cycles-pp.fput
      9.81 ±  4%      -2.4        7.42 ±  2%  perf-profile.self.cycles-pp.do_poll
      0.23 ± 11%      -0.1        0.17 ±  4%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.00           +26.5       26.48        perf-profile.self.cycles-pp.__get_file_rcu
 2.146e+10 ±  2%      +8.5%  2.329e+10 ±  2%  perf-stat.i.branch-instructions
      0.22 ± 14%      -0.0        0.19 ± 14%  perf-stat.i.branch-miss-rate%
 1.404e+10 ±  2%      +8.7%  1.526e+10 ±  2%  perf-stat.i.dTLB-stores
     70.87            -2.3       68.59        perf-stat.i.iTLB-load-miss-rate%
   5267608            -5.5%    4979133 ±  2%  perf-stat.i.iTLB-load-misses
   2102507            +5.4%    2215725        perf-stat.i.iTLB-loads
     18791 ±  3%     +10.5%      20757 ±  2%  perf-stat.i.instructions-per-iTLB-miss
    266.67 ±  2%      +6.8%     284.75 ±  2%  perf-stat.i.metric.M/sec
      0.01 ± 10%     -10.5%       0.01 ±  5%  perf-stat.overall.MPKI
      0.19            -0.0        0.17        perf-stat.overall.branch-miss-rate%
      0.65            -3.1%       0.63        perf-stat.overall.cpi
      0.00 ±  4%      -0.0        0.00 ±  4%  perf-stat.overall.dTLB-store-miss-rate%
     71.48            -2.3       69.21        perf-stat.overall.iTLB-load-miss-rate%
     18757           +10.0%      20629        perf-stat.overall.instructions-per-iTLB-miss
      1.54            +3.2%       1.59        perf-stat.overall.ipc
   4795147            +6.4%    5100406        perf-stat.overall.path-length
  2.14e+10 ±  2%      +8.5%  2.322e+10 ±  2%  perf-stat.ps.branch-instructions
   1.4e+10 ±  2%      +8.7%  1.522e+10 ±  2%  perf-stat.ps.dTLB-stores
   5253923            -5.5%    4966218 ±  2%  perf-stat.ps.iTLB-load-misses
   2095770            +5.4%    2208605        perf-stat.ps.iTLB-loads
 3.065e+13            +3.3%  3.167e+13        perf-stat.total.instructions


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki