Hello,

kernel test robot noticed a -11.2% regression of stress-ng.file-ioctl.ops_per_sec on:


commit: dfad37051ade6ac0d404ef4913f3bd01954ee51c ("remap_range: move permission hooks out of do_clone_file_range()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: stress-ng
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

    nr_threads: 10%
    disk: 1HDD
    testtime: 60s
    fs: btrfs
    test: file-ioctl
    cpufreq_governor: performance


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202401312229.eddeb9a6-oliver.sang@xxxxxxxxx


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240131/202401312229.eddeb9a6-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/1HDD/btrfs/x86_64-rhel-8.3/10%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp8/file-ioctl/stress-ng/60s

commit:
  d53471ba6f ("splice: remove permission hook from iter_file_splice_write()")
  dfad37051a ("remap_range: move permission hooks out of do_clone_file_range()")

d53471ba6f7ae97a dfad37051ade6ac0d404ef4913f
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
      2.57             -0.3       2.27        mpstat.cpu.all.usr%
      7.40            +3.4%       7.65        iostat.cpu.system
      2.50           -11.5%       2.22        iostat.cpu.user
  95739218           -11.2%   84990543 ±  2%  stress-ng.file-ioctl.ops
   1595650           -11.2%    1416506 ±  2%  stress-ng.file-ioctl.ops_per_sec
    267.41            +4.2%     278.66        stress-ng.time.system_time
     90.19           -12.5%      78.96        stress-ng.time.user_time
      0.12 ±  9%     +37.6%       0.16 ±  3%  perf-stat.i.MPKI
 5.619e+09            -4.9%  5.346e+09        perf-stat.i.branch-instructions
     25.26 ± 12%       +5.4      30.67 ±  2%  perf-stat.i.cache-miss-rate%
   3226271 ±  8%     +32.3%    4268159 ±  2%  perf-stat.i.cache-misses
  13880671 ±  2%      +7.6%   14934433        perf-stat.i.cache-references
      0.83            +3.9%       0.86        perf-stat.i.cpi
      7405 ±  8%     -26.1%       5473 ±  2%  perf-stat.i.cycles-between-cache-misses
 5.186e+09            -6.0%  4.873e+09        perf-stat.i.dTLB-stores
 2.807e+10            -3.9%  2.696e+10        perf-stat.i.instructions
      1.21            -3.7%       1.17        perf-stat.i.ipc
    257.16           +12.9%     290.46        perf-stat.i.metric.K/sec
    290.80            -4.2%     278.45        perf-stat.i.metric.M/sec
   1580051 ± 11%     +38.0%    2180479 ±  5%  perf-stat.i.node-load-misses
    228848 ± 22%    +116.2%     494834 ± 27%  perf-stat.i.node-loads
      0.11 ±  9%     +37.7%       0.16 ±  3%  perf-stat.overall.MPKI
     23.29 ± 11%       +5.3      28.58 ±  2%  perf-stat.overall.cache-miss-rate%
      0.82            +3.9%       0.86        perf-stat.overall.cpi
      7231 ±  8%     -25.1%       5416 ±  2%  perf-stat.overall.cycles-between-cache-misses
      1.21            -3.7%       1.17        perf-stat.overall.ipc
 5.524e+09            -4.8%  5.257e+09        perf-stat.ps.branch-instructions
   3170718 ±  8%     +32.4%    4196610 ±  2%  perf-stat.ps.cache-misses
  13646445 ±  2%      +7.6%   14686495 ±  2%  perf-stat.ps.cache-references
 5.099e+09            -6.0%  4.792e+09        perf-stat.ps.dTLB-stores
 2.759e+10            -3.9%  2.651e+10        perf-stat.ps.instructions
   1553350 ± 11%     +38.1%    2144498 ±  5%  perf-stat.ps.node-load-misses
    224907 ± 22%    +116.2%     486304 ± 27%  perf-stat.ps.node-loads
 1.668e+12            -3.4%  1.611e+12 ±  2%  perf-stat.total.instructions
      5.57 ±  3%       -0.7       4.85 ±  2%  perf-profile.calltrace.cycles-pp.__fget_light.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
      0.89 ± 23%       -0.4       0.45 ± 44%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
      2.30 ±  2%       -0.3       2.00        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
      1.69 ±  3%       -0.3       1.39 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
      1.99 ±  2%       -0.3       1.72        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.16 ±  3%       -0.2       1.00 ±  3%  perf-profile.calltrace.cycles-pp.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.60 ±  4%       -0.2       0.44 ± 45%  perf-profile.calltrace.cycles-pp.__fget_light.__x64_sys_fcntl.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00             +1.5       1.52 ±  2%  perf-profile.calltrace.cycles-pp.__fsnotify_parent.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl
      0.00             +6.9       6.94 ±  6%  perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl
      0.00             +7.4       7.41 ±  6%  perf-profile.calltrace.cycles-pp.security_file_permission.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl
     21.11             +7.4      28.53        perf-profile.calltrace.cycles-pp.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
      3.18 ±  2%       +8.7      11.87 ±  3%  perf-profile.calltrace.cycles-pp.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.46 ±  9%       +8.9      10.36 ±  4%  perf-profile.calltrace.cycles-pp.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64
     10.70             -1.3       9.39 ±  3%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
     11.31             -1.1      10.24 ±  2%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      7.87 ±  3%       -1.0       6.90        perf-profile.children.cycles-pp.__fget_light
      5.13             -0.7       4.46 ±  2%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      0.89             -0.4       0.46 ±  5%  perf-profile.children.cycles-pp.do_clone_file_range
      3.45 ±  2%       -0.4       3.10        perf-profile.children.cycles-pp.llseek
      1.80 ±  4%       -0.3       1.49 ±  3%  perf-profile.children.cycles-pp.stress_file_ioctl
      1.83             -0.2       1.63 ±  4%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      1.53 ±  3%       -0.2       1.34 ±  4%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
      2.32 ±  3%       -0.2       2.13        perf-profile.children.cycles-pp.syscall_return_via_sysret
      1.58 ±  2%       -0.2       1.40        perf-profile.children.cycles-pp.memdup_user
      1.81             -0.2       1.62        perf-profile.children.cycles-pp.__get_user_4
      1.26 ±  3%       -0.2       1.08 ±  3%  perf-profile.children.cycles-pp.__x64_sys_fcntl
      1.32 ±  2%       -0.2       1.14 ±  2%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      2.06 ±  2%       -0.2       1.90 ±  3%  perf-profile.children.cycles-pp.syscall_enter_from_user_mode
      1.12 ±  3%       -0.1       0.99 ±  2%  perf-profile.children.cycles-pp.security_file_ioctl
      0.84 ±  3%       -0.1       0.73 ±  3%  perf-profile.children.cycles-pp.ksys_lseek
      0.29 ±  4%       -0.1       0.18 ±  4%  perf-profile.children.cycles-pp.generic_file_rw_checks
      0.76 ±  3%       -0.1       0.68        perf-profile.children.cycles-pp.amd_clear_divider
      0.84 ±  3%       -0.1       0.75 ±  3%  perf-profile.children.cycles-pp.__put_user_4
      0.86 ±  4%       -0.1       0.78 ±  3%  perf-profile.children.cycles-pp._raw_spin_lock
      0.53 ±  3%       -0.1       0.46 ±  4%  perf-profile.children.cycles-pp.__fdget_pos
      0.19 ± 11%       -0.1       0.12 ± 10%  perf-profile.children.cycles-pp.stress_mwc8
      0.54 ±  5%       -0.1       0.48 ±  6%  perf-profile.children.cycles-pp.__check_object_size
      0.73 ±  2%       -0.1       0.67 ±  5%  perf-profile.children.cycles-pp.__fdget
      0.49 ±  2%       -0.1       0.43 ±  3%  perf-profile.children.cycles-pp.__kmalloc_node_track_caller
      0.51 ±  4%       -0.1       0.45 ±  5%  perf-profile.children.cycles-pp.ioctl@plt
      0.58 ±  3%       -0.0       0.54 ±  4%  perf-profile.children.cycles-pp.__get_user_2
      0.38 ±  3%       -0.0       0.33 ±  4%  perf-profile.children.cycles-pp.__kmem_cache_alloc_node
      0.44 ±  3%       -0.0       0.40 ±  3%  perf-profile.children.cycles-pp.__libc_fcntl64
      0.24 ±  6%       -0.0       0.20 ±  7%  perf-profile.children.cycles-pp.do_fcntl
      0.48 ±  3%       -0.0       0.44 ±  2%  perf-profile.children.cycles-pp.set_close_on_exec
      0.16 ±  8%       -0.0       0.14 ±  8%  perf-profile.children.cycles-pp.__check_heap_object
      0.00             +0.2       0.25 ±  4%  perf-profile.children.cycles-pp.fsnotify_perm
      0.57             +0.6       1.15 ±  3%  perf-profile.children.cycles-pp.aa_file_perm
     85.52             +1.4      86.91        perf-profile.children.cycles-pp.ioctl
      0.00             +1.6       1.55        perf-profile.children.cycles-pp.__fsnotify_parent
     62.60             +4.0      66.55        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     59.77             +4.3      64.05        perf-profile.children.cycles-pp.do_syscall_64
     47.98             +5.7      53.66        perf-profile.children.cycles-pp.__x64_sys_ioctl
     21.64             +7.3      28.98        perf-profile.children.cycles-pp.do_vfs_ioctl
      8.29 ±  4%       +7.4      15.74 ±  6%  perf-profile.children.cycles-pp.apparmor_file_permission
      8.78 ±  4%       +7.9      16.64 ±  5%  perf-profile.children.cycles-pp.security_file_permission
      3.30 ±  2%       +8.7      11.96 ±  3%  perf-profile.children.cycles-pp.ioctl_file_clone
      1.68             +8.9      10.55 ±  3%  perf-profile.children.cycles-pp.vfs_clone_file_range
     10.33             -1.3       9.02 ±  3%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
     11.15             -1.2       9.92 ±  2%  perf-profile.self.cycles-pp.ioctl
      7.55 ±  3%       -0.9       6.61        perf-profile.self.cycles-pp.__fget_light
      3.16 ±  4%       -0.5       2.69 ±  2%  perf-profile.self.cycles-pp.do_vfs_ioctl
      2.95 ±  2%       -0.4       2.55 ±  2%  perf-profile.self.cycles-pp.__x64_sys_ioctl
      3.32             -0.4       2.93 ±  2%  perf-profile.self.cycles-pp.do_syscall_64
      3.08 ±  2%       -0.4       2.72 ±  3%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      3.13             -0.4       2.78 ±  2%  perf-profile.self.cycles-pp.entry_SYSCALL_64
      2.39 ±  2%       -0.3       2.10 ±  2%  perf-profile.self.cycles-pp.ioctl_preallocate
      0.57 ±  2%       -0.3       0.31 ±  9%  perf-profile.self.cycles-pp.do_clone_file_range
      2.02 ±  2%       -0.3       1.77 ±  3%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      1.54 ±  4%       -0.2       1.29 ±  3%  perf-profile.self.cycles-pp.stress_file_ioctl
      1.83             -0.2       1.62 ±  4%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      2.32 ±  3%       -0.2       2.13        perf-profile.self.cycles-pp.syscall_return_via_sysret
      1.77             -0.2       1.58        perf-profile.self.cycles-pp.__get_user_4
      1.28 ±  2%       -0.2       1.11 ±  4%  perf-profile.self.cycles-pp.exit_to_user_mode_prepare
      1.76 ±  2%       -0.1       1.62 ±  3%  perf-profile.self.cycles-pp.syscall_enter_from_user_mode
      0.25 ±  6%       -0.1       0.12 ±  8%  perf-profile.self.cycles-pp.generic_file_rw_checks
      0.48 ±  2%       -0.1       0.38 ±  4%  perf-profile.self.cycles-pp.ioctl_file_clone
      0.79 ±  3%       -0.1       0.70 ±  2%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      0.81 ±  3%       -0.1       0.73 ±  4%  perf-profile.self.cycles-pp.__put_user_4
      0.81 ±  5%       -0.1       0.73 ±  3%  perf-profile.self.cycles-pp._raw_spin_lock
      0.52 ±  4%       -0.1       0.44 ±  3%  perf-profile.self.cycles-pp.amd_clear_divider
      0.17 ± 11%       -0.1       0.12 ± 10%  perf-profile.self.cycles-pp.stress_mwc8
      0.57 ±  3%       -0.0       0.52 ±  4%  perf-profile.self.cycles-pp.__get_user_2
      0.42 ±  4%       -0.0       0.38 ±  3%  perf-profile.self.cycles-pp.__libc_fcntl64
      0.30 ±  3%       -0.0       0.26 ±  5%  perf-profile.self.cycles-pp.__x64_sys_fcntl
      0.22 ±  5%       -0.0       0.18 ±  6%  perf-profile.self.cycles-pp.do_fcntl
      0.28 ±  3%       -0.0       0.24 ±  2%  perf-profile.self.cycles-pp.__kmem_cache_alloc_node
      0.00             +0.2       0.22 ±  4%  perf-profile.self.cycles-pp.fsnotify_perm
      0.49 ±  3%       +0.4       0.92 ±  2%  perf-profile.self.cycles-pp.security_file_permission
      0.46 ±  2%       +0.5       0.96 ±  2%  perf-profile.self.cycles-pp.aa_file_perm
      0.00             +1.5       1.52 ±  2%  perf-profile.self.cycles-pp.__fsnotify_parent
      7.75 ±  4%       +6.8      14.58 ±  7%  perf-profile.self.cycles-pp.apparmor_file_permission


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki