Hello,

kernel test robot noticed a 1.9% improvement of stress-ng.dup.ops_per_sec on:

commit: c69ff4071935f946f1cddc59e1d36a03442ed015 ("filelock: split leases out of struct file_lock")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: stress-ng
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

    nr_threads: 100%
    disk: 1HDD
    testtime: 60s
    fs: ext4
    test: dup
    cpufreq_governor: performance

Details are as below:
-------------------------------------------------------------------------------------------------->

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240403/202404031033.c2d3b356-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/1HDD/ext4/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/dup/stress-ng/60s

commit:
  282c30f320 ("filelock: remove temporary compatibility macros")
  c69ff40719 ("filelock: split leases out of struct file_lock")

282c30f320ba2579 c69ff4071935f946f1cddc59e1d
---------------- ---------------------------
           %stddev %change           %stddev
               \      |                  \
    195388         +2.0%      199324       vmstat.system.cs
   1502041         +1.9%     1531046       stress-ng.dup.ops
     25032         +1.9%       25516       stress-ng.dup.ops_per_sec
      2020         -1.9%        1982       stress-ng.time.system_time
    176.48        +11.1%      196.06       stress-ng.time.user_time
   3992532         +1.8%     4063489       stress-ng.time.voluntary_context_switches
 1.949e+10         +2.3%   1.994e+10       perf-stat.i.branch-instructions
      1.51         -3.2%        1.46       perf-stat.i.cpi
 9.495e+10         +2.3%   9.711e+10       perf-stat.i.instructions
      0.67         +3.6%        0.70       perf-stat.i.ipc
      1.51         -3.4%        1.46       perf-stat.overall.cpi
      0.66         +3.5%        0.69       perf-stat.overall.ipc
    198601         +1.9%      202371       perf-stat.ps.context-switches
     16.89          -3.1       13.75       perf-profile.calltrace.cycles-pp.filp_flush.filp_close.put_files_struct.do_exit.do_group_exit
     24.02          -2.8       21.19       perf-profile.calltrace.cycles-pp.filp_close.put_files_struct.do_exit.do_group_exit.__x64_sys_exit_group
     12.92          -2.7       10.25       perf-profile.calltrace.cycles-pp.locks_remove_posix.filp_flush.filp_close.put_files_struct.do_exit
     33.85          -2.5       31.32       perf-profile.calltrace.cycles-pp.put_files_struct.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
     53.34          -1.8       51.51       perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
     53.34          -1.8       51.51       perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
     53.31          -1.8       51.47       perf-profile.calltrace.cycles-pp.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
     53.31          -1.8       51.47       perf-profile.calltrace.cycles-pp.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
     54.16          -1.8       52.34       perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.62          +0.0        0.64       perf-profile.calltrace.cycles-pp.do_dentry_open.do_open.path_openat.do_filp_open.do_sys_openat2
      0.74          +0.0        0.77       perf-profile.calltrace.cycles-pp.acct_collect.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
      0.86          +0.0        0.89       perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe._exit
      0.86          +0.0        0.89       perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe._exit
      0.60          +0.0        0.63       perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write.anon_vma_clone
      0.86          +0.0        0.88       perf-profile.calltrace.cycles-pp.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe._exit
      0.86          +0.0        0.89       perf-profile.calltrace.cycles-pp._exit
      0.68          +0.0        0.72       perf-profile.calltrace.cycles-pp.do_open.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat
      0.82          +0.0        0.86       perf-profile.calltrace.cycles-pp.up_write.free_pgtables.exit_mmap.__mmput.exit_mm
      0.70 ± 2%     +0.0        0.74       perf-profile.calltrace.cycles-pp.kmem_cache_alloc.anon_vma_clone.anon_vma_fork.dup_mmap.dup_mm
      0.81          +0.0        0.85       perf-profile.calltrace.cycles-pp.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.59          +0.0        0.63 ± 5%  perf-profile.calltrace.cycles-pp.__libc_fork
      0.85          +0.0        0.89       perf-profile.calltrace.cycles-pp.__do_sys_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe.wait4
      0.85          +0.0        0.88       perf-profile.calltrace.cycles-pp.kernel_wait4.__do_sys_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe.wait4
      0.65          +0.0        0.69       perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write.anon_vma_fork
      1.09          +0.0        1.14       perf-profile.calltrace.cycles-pp.kmem_cache_free.exit_mmap.__mmput.exit_mm.do_exit
      0.95          +0.0        1.00       perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.wait4
      0.96          +0.0        1.00       perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.wait4
      0.94          +0.0        0.98       perf-profile.calltrace.cycles-pp.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write.anon_vma_clone.anon_vma_fork
      0.98          +0.0        1.02       perf-profile.calltrace.cycles-pp.wait4
      0.55          +0.0        0.60 ± 7%  perf-profile.calltrace.cycles-pp.stress_dup
      0.98          +0.0        1.03       perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.down_write.anon_vma_clone.anon_vma_fork.dup_mmap
      1.44          +0.0        1.48       perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.42          +0.0        1.47       perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat.do_syscall_64
      1.86          +0.1        1.91       perf-profile.calltrace.cycles-pp.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
      1.85          +0.1        1.90       perf-profile.calltrace.cycles-pp.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64
      1.18          +0.1        1.24       perf-profile.calltrace.cycles-pp.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe.__clone
      1.18          +0.1        1.24       perf-profile.calltrace.cycles-pp.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe.__clone
      1.18          +0.1        1.24       perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__clone
      1.18          +0.1        1.24       perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__clone
      1.51          +0.1        1.57       perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      1.74          +0.1        1.80       perf-profile.calltrace.cycles-pp.kmem_cache_alloc.vm_area_dup.dup_mmap.dup_mm.copy_process
      1.60          +0.1        1.67       perf-profile.calltrace.cycles-pp.kmem_cache_free.unlink_anon_vmas.free_pgtables.exit_mmap.__mmput
      1.83          +0.1        1.89       perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
      1.86          +0.1        1.93       perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
      1.82          +0.1        1.89       perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      1.52          +0.1        1.59       perf-profile.calltrace.cycles-pp.__clone
      1.82          +0.1        1.89       perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      1.28          +0.1        1.36       perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.down_write.anon_vma_fork.dup_mmap.dup_mm
      1.36          +0.1        1.44       perf-profile.calltrace.cycles-pp.down_write.anon_vma_fork.dup_mmap.dup_mm.copy_process
      1.22          +0.1        1.30       perf-profile.calltrace.cycles-pp.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write.anon_vma_fork.dup_mmap
      2.38          +0.1        2.48       perf-profile.calltrace.cycles-pp.vm_area_dup.dup_mmap.dup_mm.copy_process.kernel_clone
      3.34          +0.2        3.51       perf-profile.calltrace.cycles-pp.anon_vma_interval_tree_insert.anon_vma_clone.anon_vma_fork.dup_mmap.dup_mm
      5.14          +0.2        5.37       perf-profile.calltrace.cycles-pp.unlink_anon_vmas.free_pgtables.exit_mmap.__mmput.exit_mm
      6.76          +0.3        7.04       perf-profile.calltrace.cycles-pp.anon_vma_clone.anon_vma_fork.dup_mmap.dup_mm.copy_process
      5.87          +0.3        6.16       perf-profile.calltrace.cycles-pp.fput.filp_close.put_files_struct.do_exit.do_group_exit
      7.49          +0.3        7.82       perf-profile.calltrace.cycles-pp.free_pgtables.exit_mmap.__mmput.exit_mm.do_exit
      9.30          +0.4        9.69       perf-profile.calltrace.cycles-pp.anon_vma_fork.dup_mmap.dup_mm.copy_process.kernel_clone
      0.10 ±200%    +0.4        0.52       perf-profile.calltrace.cycles-pp.fifo_open.do_dentry_open.do_open.path_openat.do_filp_open
      7.40          +0.5        7.86       perf-profile.calltrace.cycles-pp.dup_fd.copy_process.kernel_clone.__do_sys_clone.do_syscall_64
      0.05 ±299%    +0.5        0.52 ± 2%  perf-profile.calltrace.cycles-pp.osq_lock.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write.anon_vma_fork
     17.56          +0.6       18.18       perf-profile.calltrace.cycles-pp.exit_mmap.__mmput.exit_mm.do_exit.do_group_exit
     17.62          +0.6       18.24       perf-profile.calltrace.cycles-pp.__mmput.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group
     17.63          +0.6       18.25       perf-profile.calltrace.cycles-pp.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
     19.31          +0.7       20.03       perf-profile.calltrace.cycles-pp.dup_mmap.dup_mm.copy_process.kernel_clone.__do_sys_clone
     19.75          +0.7       20.48       perf-profile.calltrace.cycles-pp.dup_mm.copy_process.kernel_clone.__do_sys_clone.do_syscall_64
     28.58          +1.3       29.84       perf-profile.calltrace.cycles-pp.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
     28.58          +1.3       29.84       perf-profile.calltrace.cycles-pp.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
     28.59          +1.3       29.85       perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe._Fork
     28.59          +1.3       29.86       perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe._Fork
     29.12          +1.3       30.41       perf-profile.calltrace.cycles-pp._Fork
     29.02          +1.3       30.31       perf-profile.calltrace.cycles-pp.copy_process.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe
     17.87          -3.2       14.72       perf-profile.children.cycles-pp.filp_flush
     25.00          -2.9       22.12       perf-profile.children.cycles-pp.filp_close
     13.27          -2.7       10.58       perf-profile.children.cycles-pp.locks_remove_posix
     34.49          -2.5       31.99       perf-profile.children.cycles-pp.put_files_struct
     54.16          -1.8       52.36       perf-profile.children.cycles-pp.__x64_sys_exit_group
     54.16          -1.8       52.35       perf-profile.children.cycles-pp.do_exit
     54.17          -1.8       52.36       perf-profile.children.cycles-pp.do_group_exit
     88.90          -0.4       88.52       perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     88.88          -0.4       88.50       perf-profile.children.cycles-pp.do_syscall_64
      0.46 ± 2%     +0.0        0.48       perf-profile.children.cycles-pp.asm_sysvec_call_function_single
      0.24 ± 2%     +0.0        0.26 ± 4%  perf-profile.children.cycles-pp.memcg_account_kmem
      0.51          +0.0        0.53       perf-profile.children.cycles-pp.find_idlest_cpu
      0.61          +0.0        0.63       perf-profile.children.cycles-pp.irq_exit_rcu
      0.66          +0.0        0.69       perf-profile.children.cycles-pp.mas_next_slot
      0.44          +0.0        0.46       perf-profile.children.cycles-pp.___slab_alloc
      0.36          +0.0        0.38       perf-profile.children.cycles-pp.mm_init
      0.75          +0.0        0.78       perf-profile.children.cycles-pp.acct_collect
      0.47          +0.0        0.49       perf-profile.children.cycles-pp.dup_userfaultfd
      0.49          +0.0        0.52       perf-profile.children.cycles-pp.fifo_open
      0.55          +0.0        0.58       perf-profile.children.cycles-pp.lock_vma_under_rcu
      0.78          +0.0        0.80       perf-profile.children.cycles-pp.rcu_do_batch
      0.18 ± 3%     +0.0        0.21 ± 17% perf-profile.children.cycles-pp.process_one_work
      0.70          +0.0        0.73       perf-profile.children.cycles-pp.wake_up_new_task
      0.84          +0.0        0.86       perf-profile.children.cycles-pp.mas_find
      0.87          +0.0        0.90       perf-profile.children.cycles-pp._exit
      0.62          +0.0        0.65       perf-profile.children.cycles-pp.do_dentry_open
      0.85          +0.0        0.88       perf-profile.children.cycles-pp.load_balance
      0.69          +0.0        0.72       perf-profile.children.cycles-pp.do_open
      0.44 ± 2%     +0.0        0.48       perf-profile.children.cycles-pp.__pte_offset_map_lock
      0.92          +0.0        0.95       perf-profile.children.cycles-pp.newidle_balance
      1.00          +0.0        1.03       perf-profile.children.cycles-pp.pick_next_task_fair
      0.85          +0.0        0.89       perf-profile.children.cycles-pp.__do_sys_wait4
      0.85          +0.0        0.88       perf-profile.children.cycles-pp.kernel_wait4
      0.98          +0.0        1.02       perf-profile.children.cycles-pp.wait4
      1.05 ± 2%     +0.0        1.09       perf-profile.children.cycles-pp.__vm_area_free
      0.65          +0.0        0.69 ± 5%  perf-profile.children.cycles-pp.__libc_fork
      0.97          +0.0        1.01       perf-profile.children.cycles-pp.schedule
      1.43          +0.0        1.47       perf-profile.children.cycles-pp.path_openat
      1.44          +0.0        1.49       perf-profile.children.cycles-pp.do_filp_open
      0.45 ± 2%     +0.0        0.50 ± 3%  perf-profile.children.cycles-pp.memset_orig
      0.61          +0.0        0.66 ± 8%  perf-profile.children.cycles-pp.stress_dup
      1.13          +0.1        1.18       perf-profile.children.cycles-pp.do_wait
      0.57 ± 2%     +0.1        0.62 ± 6%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
      1.86          +0.1        1.92       perf-profile.children.cycles-pp.__x64_sys_openat
      1.86          +0.1        1.91       perf-profile.children.cycles-pp.do_sys_openat2
      1.06          +0.1        1.12 ± 3%  perf-profile.children.cycles-pp.ret_from_fork_asm
      1.54          +0.1        1.60       perf-profile.children.cycles-pp.cpuidle_idle_call
      1.68          +0.1        1.74       perf-profile.children.cycles-pp.__schedule
      1.83          +0.1        1.89       perf-profile.children.cycles-pp.start_secondary
      1.86          +0.1        1.93       perf-profile.children.cycles-pp.cpu_startup_entry
      1.86          +0.1        1.93       perf-profile.children.cycles-pp.secondary_startup_64_no_verify
      1.77          +0.1        1.84       perf-profile.children.cycles-pp.__slab_free
      1.85          +0.1        1.92       perf-profile.children.cycles-pp.do_idle
      1.54          +0.1        1.61       perf-profile.children.cycles-pp.__clone
      1.09 ± 2%     +0.1        1.17 ± 2%  perf-profile.children.cycles-pp.__anon_vma_interval_tree_remove
      1.23          +0.1        1.31 ± 2%  perf-profile.children.cycles-pp._raw_spin_lock
      2.51          +0.1        2.59       perf-profile.children.cycles-pp.up_write
      2.44 ± 2%     +0.1        2.53       perf-profile.children.cycles-pp.__memcg_slab_free_hook
      2.40          +0.1        2.50       perf-profile.children.cycles-pp.vm_area_dup
      1.71          +0.1        1.81       perf-profile.children.cycles-pp.rwsem_spin_on_owner
      3.53          +0.1        3.65       perf-profile.children.cycles-pp.kmem_cache_alloc
      3.37          +0.2        3.53       perf-profile.children.cycles-pp.anon_vma_interval_tree_insert
      3.03          +0.2        3.20       perf-profile.children.cycles-pp.rwsem_optimistic_spin
      3.21          +0.2        3.38       perf-profile.children.cycles-pp.rwsem_down_write_slowpath
      4.73          +0.2        4.91       perf-profile.children.cycles-pp.kmem_cache_free
      5.16          +0.2        5.39       perf-profile.children.cycles-pp.unlink_anon_vmas
      5.23          +0.2        5.46       perf-profile.children.cycles-pp.down_write
      6.77          +0.3        7.04       perf-profile.children.cycles-pp.anon_vma_clone
      6.56          +0.3        6.84       perf-profile.children.cycles-pp.fput
      7.51          +0.3        7.84       perf-profile.children.cycles-pp.free_pgtables
      9.32          +0.4        9.71       perf-profile.children.cycles-pp.anon_vma_fork
      7.40          +0.5        7.86       perf-profile.children.cycles-pp.dup_fd
     17.58          +0.6       18.20       perf-profile.children.cycles-pp.exit_mmap
     17.67          +0.6       18.29       perf-profile.children.cycles-pp.exit_mm
     17.62          +0.6       18.24       perf-profile.children.cycles-pp.__mmput
     19.36          +0.7       20.08       perf-profile.children.cycles-pp.dup_mmap
     19.75          +0.7       20.48       perf-profile.children.cycles-pp.dup_mm
     29.18          +1.3       30.47       perf-profile.children.cycles-pp._Fork
     29.03          +1.3       30.32       perf-profile.children.cycles-pp.copy_process
     29.76          +1.3       31.08       perf-profile.children.cycles-pp.__do_sys_clone
     29.76          +1.3       31.08       perf-profile.children.cycles-pp.kernel_clone
     12.85          -2.7       10.18       perf-profile.self.cycles-pp.locks_remove_posix
      3.16          -0.4        2.80       perf-profile.self.cycles-pp.filp_flush
      0.72          +0.0        0.74       perf-profile.self.cycles-pp.kmem_cache_alloc
      0.50          +0.0        0.53       perf-profile.self.cycles-pp.kmem_cache_free
      0.81          +0.0        0.85       perf-profile.self.cycles-pp._raw_spin_lock
      0.44 ± 2%     +0.0        0.49 ± 3%  perf-profile.self.cycles-pp.memset_orig
      0.57 ± 2%     +0.1        0.62 ± 6%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      1.70          +0.1        1.76       perf-profile.self.cycles-pp.__slab_free
      2.46          +0.1        2.54       perf-profile.self.cycles-pp.up_write
      1.70          +0.1        1.79       perf-profile.self.cycles-pp.rwsem_spin_on_owner
      3.33          +0.2        3.50       perf-profile.self.cycles-pp.anon_vma_interval_tree_insert
      6.13          +0.3        6.44       perf-profile.self.cycles-pp.fput
      7.02          +0.4        7.43       perf-profile.self.cycles-pp.dup_fd
      7.28          +0.6        7.85       perf-profile.self.cycles-pp.put_files_struct


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
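
Note on reading the %change column: for the throughput and time rows the figure appears to be the relative change between the two commits, while for the perf-profile rows it appears to be the plain difference in percentage points of cycles. The sketch below rechecks a few rows from the table under that assumption. The helper names `pct_change` and `point_delta` are illustrative only and are not part of lkp-tests, and values recomputed from the rounded table entries may differ from the robot's own figures in the last digit.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Absolute difference, used here for perf-profile cycle shares."""
    return new - base


# (base commit 282c30f320, tested commit c69ff40719), values copied from the table
relative_rows = {
    "stress-ng.dup.ops_per_sec": (25032, 25516),
    "stress-ng.time.system_time": (2020, 1982),
    "stress-ng.time.user_time": (176.48, 196.06),
}
profile_rows = {
    "children.cycles-pp.filp_flush": (17.87, 14.72),
    "self.cycles-pp.locks_remove_posix": (12.85, 10.18),
}

for name, (base, new) in relative_rows.items():
    print(f"{name}: {pct_change(base, new):+.1f}%")

for name, (base, new) in profile_rows.items():
    print(f"{name}: {point_delta(base, new):+.1f}")
```

Read this way, the largest single shift in the self profile is locks_remove_posix, which drops by about 2.7 points of cycles.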