Hello,

kernel test robot noticed a -4.8% regression of stress-ng.seek.ops_per_sec on:


commit: 3d04e89a11244255549fe838e9c6126e0e64729b ("file: don't optimize for f_count equal 1")
https://git.kernel.org/cgit/linux/kernel/git/vfs/vfs.git vfs.fdget_pos

testcase: stress-ng
test machine: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 128G memory
parameters:

	nr_threads: 1
	disk: 1HDD
	testtime: 60s
	fs: ext4
	class: os
	test: seek
	cpufreq_governor: performance


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202308141149.d38fdf91-oliver.sang@xxxxxxxxx


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20230814/202308141149.d38fdf91-oliver.sang@xxxxxxxxx

=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  os/gcc-12/performance/1HDD/ext4/x86_64-rhel-8.3/1/debian-11.1-x86_64-20220510.cgz/lkp-csl-d02/seek/stress-ng/60s

commit:
  v6.5-rc1
  3d04e89a11 ("file: don't optimize for f_count equal 1")

        v6.5-rc1 3d04e89a11244255549fe838e9c
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
      2.35          +2.6%         2.41        iostat.cpu.system
      1.16          -6.2%         1.08        iostat.cpu.user
    163.48          +5.2%       172.02        stress-ng.seek.nanosecs_per_seek
  22830283          -4.8%     21730866        stress-ng.seek.ops
    380493          -4.8%       362170        stress-ng.seek.ops_per_sec
      1.61          -3.8%         1.55 ±  2%  perf-stat.i.MPKI
      1.32 ±  4%     -0.1         1.20 ±  3%  perf-stat.i.branch-miss-rate%
  16447081 ±  4%    -7.0%     15288768 ±  2%  perf-stat.i.branch-misses
   9058578          -3.9%      8705166        perf-stat.i.cache-references
 8.886e+08          +1.2%    8.989e+08        perf-stat.i.dTLB-stores
  10392601 ±  6%   -12.3%      9113713 ±  3%  perf-stat.i.iTLB-load-misses
    753.71 ±  4%    +9.6%       826.21 ±  4%  perf-stat.i.instructions-per-iTLB-miss
    275.76          -3.1%       267.08        perf-stat.i.metric.K/sec
      1.61          -3.7%         1.55        perf-stat.overall.MPKI
      1.43 ±  4%     -0.1         1.32 ±  2%  perf-stat.overall.branch-miss-rate%
    544.45 ±  6%   +13.4%       617.59 ±  4%  perf-stat.overall.instructions-per-iTLB-miss
  16185644 ±  4%    -7.0%     15046357 ±  2%  perf-stat.ps.branch-misses
   8914660          -3.9%      8567102        perf-stat.ps.cache-references
 8.745e+08          +1.2%    8.846e+08        perf-stat.ps.dTLB-stores
  10227957 ±  6%   -12.3%      8969124 ±  3%  perf-stat.ps.iTLB-load-misses
     15.97 ±  5%     -2.1        13.85 ±  4%  perf-profile.calltrace.cycles-pp.ext4_llseek.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
      1.43 ±  7%     -0.6         0.80 ± 12%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
      1.12 ±  7%     -0.3         0.80 ± 12%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.llseek
      0.00           +1.7         1.66 ±  6%  perf-profile.calltrace.cycles-pp.mutex_unlock.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
      0.00           +2.3         2.30 ± 10%  perf-profile.calltrace.cycles-pp.mutex_lock.__fdget_pos.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe
     19.90 ±  4%     +2.4        22.28 ±  6%  perf-profile.calltrace.cycles-pp.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
      2.41 ±  5%     +2.5         4.92 ± 10%  perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_lseek.do_syscall_64.entry_SYSCALL_64_after_hwframe.llseek
     16.09 ±  4%     -2.1        13.97 ±  4%  perf-profile.children.cycles-pp.ext4_llseek
      1.84 ±  4%     -0.6         1.22 ± 10%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      1.40 ±  5%     -0.3         1.10 ±  7%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.66 ±  3%     -0.2         0.45 ± 14%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
      1.78 ±  7%     -0.2         1.57 ±  7%  perf-profile.children.cycles-pp.iomap_iter_advance
      0.51 ± 11%     -0.2         0.34 ± 14%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      0.00           +0.1         0.12 ± 22%  perf-profile.children.cycles-pp.__x64_sys_read
      0.00           +0.1         0.15 ± 25%  perf-profile.children.cycles-pp.__f_unlock_pos
      0.28 ± 13%     +0.2         0.50 ± 17%  perf-profile.children.cycles-pp.rcu_all_qs
      0.56 ±  9%     +0.5         1.09 ± 12%  perf-profile.children.cycles-pp.__cond_resched
      0.00           +2.0         1.99 ±  6%  perf-profile.children.cycles-pp.mutex_unlock
     20.08 ±  5%     +2.4        22.49 ±  6%  perf-profile.children.cycles-pp.ksys_lseek
      0.00           +2.7         2.70 ±  9%  perf-profile.children.cycles-pp.mutex_lock
      3.19 ±  5%     +2.8         6.02 ±  9%  perf-profile.children.cycles-pp.__fdget_pos
      3.39 ±  5%     -1.9         1.53 ±  2%  perf-profile.self.cycles-pp.ext4_llseek
      1.31 ±  6%     -0.3         1.00 ±  7%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.74 ±  5%     -0.2         0.49 ± 14%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      0.57 ±  5%     -0.2         0.35 ± 16%  perf-profile.self.cycles-pp.exit_to_user_mode_prepare
      0.46 ± 15%     -0.2         0.25 ± 10%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      1.76 ±  6%     -0.2         1.55 ±  6%  perf-profile.self.cycles-pp.iomap_iter_advance
      0.00           +0.1         0.12 ± 24%  perf-profile.self.cycles-pp.__x64_sys_read
      0.22 ± 16%     +0.2         0.37 ± 20%  perf-profile.self.cycles-pp.rcu_all_qs
      0.33 ±  9%     +0.3         0.67 ± 12%  perf-profile.self.cycles-pp.__cond_resched
      0.00           +2.0         1.97 ±  6%  perf-profile.self.cycles-pp.mutex_unlock
      0.00           +2.1         2.08 ± 10%  perf-profile.self.cycles-pp.mutex_lock


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki