Hello,

kernel test robot noticed a 6.7% improvement of will-it-scale.per_thread_ops on:


commit: 0c40bf47cf2d9e1413b1e62826c89c2341e66e40 ("fs/file.c: add fast path in find_next_fd()")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

testcase: will-it-scale
config: x86_64-rhel-8.3
compiler: gcc-12
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
parameters:

	nr_task: 100%
	mode: thread
	test: dup1
	cpufreq_governor: performance


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241113/202411132104.c3e2d29f-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/thread/100%/debian-12-x86_64-20240206.cgz/lkp-spr-2sp4/dup1/will-it-scale

commit:
  c9a3019603 ("fs/file.c: conditionally clear full_fds")
  0c40bf47cf ("fs/file.c: add fast path in find_next_fd()")

c9a3019603b8a851 0c40bf47cf2d9e1413b1e62826c
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
     33.83 ± 20%     +27.1%       43.00 ± 12%  perf-c2c.DRAM.local
      0.42 ±  6%     +26.9%        0.54 ±  8%  perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
      0.29 ±118%     -58.2%        0.12 ±  2%  perf-sched.sch_delay.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.42 ±  6%     +26.9%        0.54 ±  8%  perf-sched.wait_time.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
    878341 ±  2%      +6.7%      937496 ±  2%  will-it-scale.224.threads
      3920 ±  2%      +6.7%        4184 ±  2%  will-it-scale.per_thread_ops
    878341 ±  2%      +6.7%      937496 ±  2%  will-it-scale.workload
      0.06            +0.0         0.08 ±  6%  perf-profile.children.cycles-pp.__fput_sync
      0.00            +0.1         0.06 ±  6%  perf-profile.children.cycles-pp.find_next_fd
      0.06            +0.0         0.08 ±  6%  perf-profile.self.cycles-pp.__fput_sync
      0.09            +0.0         0.11        perf-profile.self.cycles-pp._raw_spin_lock
      0.00            +0.1         0.05        perf-profile.self.cycles-pp.find_next_fd
      0.18 ±  2%     +11.5%        0.20 ±  3%  perf-stat.i.MPKI
     26.36 ±  3%      +2.4        28.75 ±  4%  perf-stat.i.cache-miss-rate%
   8636784 ±  2%     +11.9%     9662915 ±  3%  perf-stat.i.cache-misses
     76932 ±  2%     -10.9%       68510 ±  4%  perf-stat.i.cycles-between-cache-misses
      0.17 ±  2%     +11.8%        0.19 ±  3%  perf-stat.overall.MPKI
     24.92 ±  3%      +2.2        27.16 ±  4%  perf-stat.overall.cache-miss-rate%
     74960 ±  2%     -10.5%       67085 ±  4%  perf-stat.overall.cycles-between-cache-misses
  17263265 ±  2%      -6.3%    16180844 ±  2%  perf-stat.overall.path-length
   8605013 ±  2%     +11.9%     9625712 ±  3%  perf-stat.ps.cache-misses




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
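
For readers who want to sanity-check the %change column, the relative changes can be recomputed from the two reported means. The following is a minimal sketch, not part of the robot's tooling: the helper name `pct_change` is hypothetical, and the values are copied from the comparison table in this report. Rows whose change is printed without a percent sign (for example cache-miss-rate% and the perf-profile entries) appear to be absolute differences and are not covered here.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from the parent-commit mean to the patched-commit mean, in percent."""
    return (new - base) / base * 100.0


# (parent c9a3019603, patched 0c40bf47cf) means copied from the report table.
rows = {
    "will-it-scale.workload": (878341, 937496),
    "will-it-scale.per_thread_ops": (3920, 4184),
    "perf-stat.overall.path-length": (17263265, 16180844),
    "perf-stat.overall.cycles-between-cache-misses": (74960, 67085),
}

for name, (base, new) in rows.items():
    print(f"{pct_change(base, new):+6.1f}%  {name}")
```

Rounded to one decimal place, these reproduce the +6.7%, +6.7%, -6.3% and -10.5% figures shown for the same rows.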