Hello,

kernel test robot noticed a 4.2% improvement of aim9.creat-clo.ops_per_sec on:


commit: e747e15156b79efeea0ad056df8de14b93d318c2 ("fs: try an opportunistic lookup for O_CREAT opens too")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: aim9
config: x86_64-rhel-8.3
compiler: gcc-12
test machine: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 64G memory
parameters:

	testtime: 300s
	test: creat-clo
	cpufreq_governor: performance


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241014/202410141350.a747ff5e-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/debian-12-x86_64-20240206.cgz/lkp-ivb-2ep2/creat-clo/aim9/300s

commit:
  b9ca079dd6 ("eventpoll: Annotate data-race of busy_poll_usecs")
  e747e15156 ("fs: try an opportunistic lookup for O_CREAT opens too")

b9ca079dd6b09e08 e747e15156b79efeea0ad056df8
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
    448590            +4.2%     467421        aim9.creat-clo.ops_per_sec
      5868 ± 71%     -99.7%      19.67 ± 79%  proc-vmstat.numa_hint_faults
      2929 ±112%     -99.4%      17.33 ± 96%  proc-vmstat.numa_pages_migrated
      2929 ±112%     -99.4%      17.33 ± 96%  proc-vmstat.pgmigrate_success
      0.04 ± 61%    -100.0%       0.00        perf-sched.sch_delay.avg.ms.__cond_resched.mnt_want_write.open_last_lookups.path_openat.do_filp_open
      0.09 ± 62%    -100.0%       0.00        perf-sched.sch_delay.max.ms.__cond_resched.mnt_want_write.open_last_lookups.path_openat.do_filp_open
      2.12 ± 44%  +24071.1%     512.02 ±176%  perf-sched.sch_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.04 ± 61%    -100.0%       0.00        perf-sched.wait_time.avg.ms.__cond_resched.mnt_want_write.open_last_lookups.path_openat.do_filp_open
      0.09 ± 62%    -100.0%       0.00        perf-sched.wait_time.max.ms.__cond_resched.mnt_want_write.open_last_lookups.path_openat.do_filp_open
 7.648e+08            -2.8%   7.43e+08        perf-stat.i.branch-instructions
      1.60             +0.1       1.69        perf-stat.i.branch-miss-rate%
      1.14            +2.6%       1.17        perf-stat.i.cpi
 3.776e+09            -1.9%  3.706e+09        perf-stat.i.instructions
      0.89            -2.6%       0.87        perf-stat.i.ipc
      2.00             +0.1       2.10        perf-stat.overall.branch-miss-rate%
      1.11            +2.4%       1.14        perf-stat.overall.cpi
      0.90            -2.4%       0.88        perf-stat.overall.ipc
 7.623e+08            -2.8%  7.406e+08        perf-stat.ps.branch-instructions
 3.763e+09            -1.8%  3.694e+09        perf-stat.ps.instructions
 1.135e+12            -1.9%  1.113e+12        perf-stat.total.instructions
      2.34 ±  5%       -1.7       0.69 ±  8%  perf-profile.calltrace.cycles-pp.open_last_lookups.path_openat.do_filp_open.do_sys_openat2.__x64_sys_creat
     23.22             -1.1      22.16        perf-profile.calltrace.cycles-pp.do_sys_openat2.__x64_sys_creat.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
     23.56             -1.1      22.49        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
     23.68             -1.1      22.62        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.creat64
     23.27             -1.1      22.21        perf-profile.calltrace.cycles-pp.__x64_sys_creat.do_syscall_64.entry_SYSCALL_64_after_hwframe.creat64
     18.68             -0.8      17.84        perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_openat2.__x64_sys_creat.do_syscall_64
     19.05             -0.8      18.26        perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_openat2.__x64_sys_creat.do_syscall_64.entry_SYSCALL_64_after_hwframe
     29.56             -0.7      28.81        perf-profile.calltrace.cycles-pp.creat64
      0.86 ±  3%       +0.0       0.90 ±  2%  perf-profile.calltrace.cycles-pp.security_file_alloc.init_file.alloc_empty_file.path_openat.do_filp_open
      1.29             +0.1       1.38 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.creat64
      1.01 ±  4%       +0.1       1.12 ±  5%  perf-profile.calltrace.cycles-pp.ima_file_check.security_file_post_open.do_open.path_openat.do_filp_open
      1.07 ±  5%       +0.1       1.18 ±  5%  perf-profile.calltrace.cycles-pp.security_file_post_open.do_open.path_openat.do_filp_open.do_sys_openat2
      1.53 ±  3%       +0.1       1.65 ±  3%  perf-profile.calltrace.cycles-pp.cap_inode_need_killpriv.security_inode_need_killpriv.dentry_needs_remove_privs.do_truncate.do_open
      1.65 ±  3%       +0.1       1.78 ±  3%  perf-profile.calltrace.cycles-pp.security_inode_need_killpriv.dentry_needs_remove_privs.do_truncate.do_open.path_openat
      0.71 ±  6%       +0.1       0.84 ± 13%  perf-profile.calltrace.cycles-pp.kmem_cache_free.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
      1.72 ±  3%       +0.1       1.86 ±  2%  perf-profile.calltrace.cycles-pp.dentry_needs_remove_privs.do_truncate.do_open.path_openat.do_filp_open
      1.32 ±  3%       +0.1       1.46 ±  4%  perf-profile.calltrace.cycles-pp.__vfs_getxattr.cap_inode_need_killpriv.security_inode_need_killpriv.dentry_needs_remove_privs.do_truncate
      2.57 ±  6%       +0.2       2.82 ±  4%  perf-profile.calltrace.cycles-pp.alloc_empty_file.path_openat.do_filp_open.do_sys_openat2.__x64_sys_creat
      1.32 ± 14%       +0.2       1.57 ±  8%  perf-profile.calltrace.cycles-pp.kmem_cache_alloc_noprof.alloc_empty_file.path_openat.do_filp_open.do_sys_openat2
      0.74 ± 23%       +0.3       1.02 ± 16%  perf-profile.calltrace.cycles-pp.__memcg_slab_post_alloc_hook.kmem_cache_alloc_noprof.alloc_empty_file.path_openat.do_filp_open
     11.00             +0.7      11.66        perf-profile.calltrace.cycles-pp.__close
     10.48             +0.7      11.19        perf-profile.calltrace.cycles-pp.do_open.path_openat.do_filp_open.do_sys_openat2.__x64_sys_creat
      2.39 ±  6%       -1.7       0.72 ±  7%  perf-profile.children.cycles-pp.open_last_lookups
     23.33             -1.1      22.26        perf-profile.children.cycles-pp.do_sys_openat2
     23.28             -1.1      22.22        perf-profile.children.cycles-pp.__x64_sys_creat
     18.79             -0.8      17.95        perf-profile.children.cycles-pp.path_openat
     19.13             -0.8      18.34        perf-profile.children.cycles-pp.do_filp_open
     29.79             -0.8      29.04        perf-profile.children.cycles-pp.creat64
     29.48             -0.8      28.72        perf-profile.children.cycles-pp.do_syscall_64
     29.68             -0.7      28.97        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.48 ±  5%       -0.4       0.05 ± 48%  perf-profile.children.cycles-pp.lookup_open
      0.92 ±  9%       -0.4       0.50 ±  7%  perf-profile.children.cycles-pp.try_to_unlazy
      0.78 ± 10%       -0.4       0.40 ±  7%  perf-profile.children.cycles-pp.dput
      0.79 ±  7%       -0.4       0.42 ±  7%  perf-profile.children.cycles-pp.__legitimize_path
      0.53 ± 13%       -0.3       0.26 ± 10%  perf-profile.children.cycles-pp.lockref_put_return
      0.52 ± 10%       -0.2       0.32 ± 14%  perf-profile.children.cycles-pp.terminate_walk
      0.39 ±  6%       -0.2       0.20 ±  8%  perf-profile.children.cycles-pp.__legitimize_mnt
      3.25 ±  2%       -0.2       3.07 ±  2%  perf-profile.children.cycles-pp.notify_change
      0.76 ±  6%       -0.2       0.58 ±  9%  perf-profile.children.cycles-pp._raw_spin_lock
      0.45 ±  7%       -0.2       0.28 ± 14%  perf-profile.children.cycles-pp.mnt_want_write
      0.52 ±  6%       -0.2       0.35 ±  7%  perf-profile.children.cycles-pp.security_inode_setattr
      0.33 ±  6%       -0.2       0.17 ± 15%  perf-profile.children.cycles-pp.lockref_get_not_dead
      0.32 ±  3%       -0.2       0.17 ± 12%  perf-profile.children.cycles-pp.down_write
      0.60 ± 15%       -0.2       0.45 ± 22%  perf-profile.children.cycles-pp.step_into
      0.31 ±  6%       -0.1       0.16 ±  9%  perf-profile.children.cycles-pp.up_write
      0.45 ±  7%       -0.1       0.31 ±  4%  perf-profile.children.cycles-pp.mnt_get_write_access
      0.48 ±  6%       -0.1       0.36 ±  8%  perf-profile.children.cycles-pp.__cond_resched
      0.35 ±  8%       -0.1       0.27 ±  5%  perf-profile.children.cycles-pp.evm_inode_setattr
      0.42 ±  9%       -0.1       0.34 ± 13%  perf-profile.children.cycles-pp.generic_permission
      0.15 ± 12%       -0.1       0.09 ± 12%  perf-profile.children.cycles-pp.getname
      0.20 ±  6%       -0.1       0.14 ± 13%  perf-profile.children.cycles-pp.rcu_all_qs
      0.13 ± 13%       -0.0       0.10 ± 17%  perf-profile.children.cycles-pp.mntput_no_expire
      0.09 ±  6%       -0.0       0.07 ± 11%  perf-profile.children.cycles-pp.can_stop_idle_tick
      0.06 ± 47%       +0.0       0.09 ± 10%  perf-profile.children.cycles-pp.inode_newsize_ok
      0.87 ±  2%       +0.1       0.92        perf-profile.children.cycles-pp.security_file_alloc
      0.03 ±100%       +0.1       0.08 ± 11%  perf-profile.children.cycles-pp.pm_qos_read_value
      0.53 ±  5%       +0.1       0.59 ±  3%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      0.19 ± 11%       +0.1       0.27 ±  8%  perf-profile.children.cycles-pp.setattr_prepare
      0.34 ± 10%       +0.1       0.42 ±  3%  perf-profile.children.cycles-pp.simple_xattr_get
      1.08 ±  5%       +0.1       1.19 ±  5%  perf-profile.children.cycles-pp.security_file_post_open
      1.02 ±  5%       +0.1       1.13 ±  5%  perf-profile.children.cycles-pp.ima_file_check
      1.55 ±  3%       +0.1       1.67 ±  3%  perf-profile.children.cycles-pp.cap_inode_need_killpriv
      0.33 ± 11%       +0.1       0.46 ± 24%  perf-profile.children.cycles-pp.apparmor_file_open
      1.67 ±  3%       +0.1       1.80 ±  3%  perf-profile.children.cycles-pp.security_inode_need_killpriv
      0.37 ± 10%       +0.1       0.50 ± 20%  perf-profile.children.cycles-pp.security_file_open
      0.35 ±  7%       +0.1       0.50 ± 20%  perf-profile.children.cycles-pp.security_current_getsecid_subj
      1.74 ±  3%       +0.1       1.88 ±  2%  perf-profile.children.cycles-pp.dentry_needs_remove_privs
      1.34 ±  3%       +0.2       1.50 ±  4%  perf-profile.children.cycles-pp.__vfs_getxattr
      2.98             +0.2       3.17        perf-profile.children.cycles-pp.entry_SYSCALL_64
      0.57 ±  7%       +0.4       0.98 ±  6%  perf-profile.children.cycles-pp.__d_lookup_rcu
      0.63 ±  6%       +0.4       1.08 ±  5%  perf-profile.children.cycles-pp.lookup_fast
      0.00             +0.5       0.54 ±  6%  perf-profile.children.cycles-pp.complete_walk
     11.22             +0.7      11.89        perf-profile.children.cycles-pp.__close
     10.53             +0.7      11.23        perf-profile.children.cycles-pp.do_open
      0.52 ± 12%       -0.3       0.26 ± 10%  perf-profile.self.cycles-pp.lockref_put_return
      0.73 ±  7%       -0.2       0.56 ± 10%  perf-profile.self.cycles-pp._raw_spin_lock
      0.38 ±  7%       -0.2       0.20 ±  8%  perf-profile.self.cycles-pp.__legitimize_mnt
      0.33 ±  7%       -0.2       0.17 ± 15%  perf-profile.self.cycles-pp.lockref_get_not_dead
      0.30 ±  6%       -0.1       0.16 ±  9%  perf-profile.self.cycles-pp.up_write
      0.44 ±  7%       -0.1       0.30 ±  3%  perf-profile.self.cycles-pp.mnt_get_write_access
      0.72 ±  6%       -0.1       0.60 ±  8%  perf-profile.self.cycles-pp.do_dentry_open
      0.24 ±  5%       -0.1       0.12 ± 11%  perf-profile.self.cycles-pp.down_write
      0.35 ± 11%       -0.1       0.27 ± 13%  perf-profile.self.cycles-pp.generic_permission
      0.20 ± 21%       -0.1       0.11 ±  9%  perf-profile.self.cycles-pp.open_last_lookups
      0.16 ±  9%       -0.1       0.08 ± 19%  perf-profile.self.cycles-pp.security_inode_setattr
      0.16 ± 13%       -0.1       0.09 ±  6%  perf-profile.self.cycles-pp.getname_flags
      0.27 ±  8%       -0.1       0.20 ±  6%  perf-profile.self.cycles-pp.evm_inode_setattr
      0.14 ± 14%       -0.1       0.08 ± 18%  perf-profile.self.cycles-pp.getname
      0.32 ±  2%       -0.1       0.26 ±  9%  perf-profile.self.cycles-pp.common_perm_cond
      0.25 ±  5%       -0.1       0.20 ±  8%  perf-profile.self.cycles-pp.__cond_resched
      0.17 ± 11%       -0.1       0.12 ± 14%  perf-profile.self.cycles-pp.rcu_all_qs
      0.25 ±  9%       -0.0       0.20 ± 10%  perf-profile.self.cycles-pp.alloc_fd
      0.12 ±  7%       -0.0       0.08 ± 45%  perf-profile.self.cycles-pp.shmem_file_open
      0.13 ± 13%       -0.0       0.10 ± 17%  perf-profile.self.cycles-pp.mntput_no_expire
      0.09 ±  6%       -0.0       0.07 ± 11%  perf-profile.self.cycles-pp.can_stop_idle_tick
      0.11 ± 16%       +0.0       0.15 ± 12%  perf-profile.self.cycles-pp.lockref_get
      0.03 ±100%       +0.0       0.08 ± 14%  perf-profile.self.cycles-pp.pm_qos_read_value
      0.04 ± 72%       +0.0       0.08 ±  8%  perf-profile.self.cycles-pp.inode_newsize_ok
      0.12 ± 17%       +0.0       0.17 ±  9%  perf-profile.self.cycles-pp.setattr_prepare
      0.03 ±100%       +0.1       0.09 ± 15%  perf-profile.self.cycles-pp.lookup_fast
      0.17 ± 13%       +0.1       0.25 ±  9%  perf-profile.self.cycles-pp.simple_xattr_get
      0.26 ±  9%       +0.1       0.40 ± 25%  perf-profile.self.cycles-pp.apparmor_current_getsecid_subj
      2.62             +0.2       2.81        perf-profile.self.cycles-pp.entry_SYSCALL_64
      0.41 ± 15%       +0.3       0.66 ± 15%  perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook
      0.57 ±  6%       +0.4       0.97 ±  6%  perf-profile.self.cycles-pp.__d_lookup_rcu


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki