Hello,

kernel test robot noticed a 6.8% improvement of stress-ng.access.access_calls_per_sec on:


commit: 2865baf54077aa98fcdb478cefe6a42c417b9374 ("x86: support user address masking instead of non-speculative conditional")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: stress-ng
config: x86_64-rhel-8.3
compiler: gcc-12
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

        nr_threads: 100%
        disk: 1HDD
        testtime: 60s
        fs: btrfs
        test: access
        cpufreq_governor: performance


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241016/202410161557.5b87225e-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/1HDD/btrfs/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/access/stress-ng/60s

commit:
  v6.10
  2865baf540 ("x86: support user address masking instead of non-speculative conditional")

           v6.10 2865baf54077aa98fcdb478cefe
---------------- ---------------------------
         %stddev    %change          %stddev
             \         |                \
      1008 ± 35%     -45.4%     550.53 ± 74%  numa-meminfo.node0.Inactive(file)
    100.41 ± 55%     -63.1%      37.01 ± 70%  perf-sched.wait_and_delay.max.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
   3373715            +6.8%    3603928        stress-ng.access.access_calls_per_sec
    252.58 ± 35%     -45.5%     137.68 ± 74%  numa-vmstat.node0.nr_inactive_file
    252.58 ± 35%     -45.5%     137.68 ± 74%  numa-vmstat.node0.nr_zone_inactive_file
      4.08            +3.5%       4.23        perf-stat.i.cpi
      4.10            +3.2%       4.24        perf-stat.overall.cpi
      0.24            -3.1%       0.24        perf-stat.overall.ipc
 3.326e+12            -3.2%   3.22e+12        perf-stat.total.instructions
      2.33 ±  5%       -0.2       2.10 ±  4%  perf-profile.calltrace.cycles-pp.syscall
      1.85 ±  5%       -0.2       1.63 ±  4%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.faccessat
      1.86 ±  5%       -0.2       1.65 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.syscall
      1.76 ±  5%       -0.2       1.55 ±  4%  perf-profile.calltrace.cycles-pp.do_faccessat.do_syscall_64.entry_SYSCALL_64_after_hwframe.faccessat
      1.83 ±  5%       -0.2       1.62 ±  4%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      1.87 ±  5%       -0.2       1.66 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.faccessat
      1.73 ±  5%       -0.2       1.52 ±  4%  perf-profile.calltrace.cycles-pp.do_faccessat.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      1.48 ±  5%       -0.2       1.27 ±  4%  perf-profile.calltrace.cycles-pp.user_path_at_empty.do_faccessat.do_syscall_64.entry_SYSCALL_64_after_hwframe.syscall
      1.49 ±  5%       -0.2       1.29 ±  4%  perf-profile.calltrace.cycles-pp.user_path_at_empty.do_faccessat.do_syscall_64.entry_SYSCALL_64_after_hwframe.faccessat
      2.19 ±  2%       -0.2       2.02 ±  3%  perf-profile.calltrace.cycles-pp.access
      1.84 ±  2%       -0.2       1.67 ±  3%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.access
      1.86 ±  2%       -0.2       1.69 ±  3%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.access
      1.76 ±  2%       -0.2       1.59 ±  3%  perf-profile.calltrace.cycles-pp.do_faccessat.do_syscall_64.entry_SYSCALL_64_after_hwframe.access
      1.40 ±  2%       -0.2       1.24 ±  3%  perf-profile.calltrace.cycles-pp.user_path_at_empty.do_faccessat.do_syscall_64.entry_SYSCALL_64_after_hwframe.access
      4.91 ±  4%       -0.6       4.28 ±  3%  perf-profile.children.cycles-pp.user_path_at_empty
      5.28 ±  4%       -0.6       4.70 ±  3%  perf-profile.children.cycles-pp.do_faccessat
      1.39 ±  4%       -0.5       0.84 ±  3%  perf-profile.children.cycles-pp.getname_flags
      0.95 ±  4%       -0.5       0.41 ±  3%  perf-profile.children.cycles-pp.strncpy_from_user
      2.41 ±  5%       -0.2       2.19 ±  4%  perf-profile.children.cycles-pp.syscall
      2.25 ±  2%       -0.2       2.08 ±  3%  perf-profile.children.cycles-pp.access
      0.12 ±  6%       -0.0       0.09 ±  5%  perf-profile.children.cycles-pp.btrfs_init_metadata_block_rsv
      0.08 ±  8%       -0.0       0.05 ±  8%  perf-profile.children.cycles-pp.btrfs_find_space_info
      0.10 ±  4%       -0.0       0.08 ±  5%  perf-profile.children.cycles-pp.fill_stack_inode_item
      0.48 ±  3%       -0.1       0.40 ±  3%  perf-profile.self.cycles-pp.strncpy_from_user
      0.08 ±  8%       -0.0       0.05 ±  7%  perf-profile.self.cycles-pp.btrfs_find_space_info




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
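
For readers who want to sanity-check the %change column in the comparison table, the following is a minimal sketch in Python. It is not part of the robot's tooling: the helper names `percent_change` and `point_delta` are hypothetical, and the reading of the perf-profile rows as a plain difference in percentage points is an assumption inferred from the printed values. The inputs are copied from the table.

```python
def percent_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent (hypothetical helper)."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Plain difference, assumed here to be what the perf-profile rows report."""
    return new - base


# stress-ng.access.access_calls_per_sec: v6.10 vs. 2865baf540
print(f"{percent_change(3373715, 3603928):+.1f}%")   # +6.8%

# perf-stat.total.instructions
print(f"{percent_change(3.326e12, 3.22e12):+.1f}%")  # -3.2%

# perf-profile.children.cycles-pp.strncpy_from_user (percentage points)
print(f"{point_delta(0.95, 0.41):+.1f}")             # -0.5
```

The rounded table values will not always reproduce the printed change exactly, since the report presumably computes it from unrounded measurements.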