Hello,

kernel test robot noticed a 1.2% improvement of will-it-scale.per_process_ops on:


commit: f1139c8e66d5c618aad04a93a2378ad9586464f9 ("[PATCH v5 1/3] fs/file.c: remove sanity_check and add likely/unlikely in alloc_fd()")
url: https://github.com/intel-lab-lkp/linux/commits/Yu-Ma/fs-file-c-remove-sanity_check-and-add-likely-unlikely-in-alloc_fd/20240717-224830
base: https://git.kernel.org/cgit/linux/kernel/git/vfs/vfs.git vfs.all
patch link: https://lore.kernel.org/all/20240717145018.3972922-2-yu.ma@xxxxxxxxx/
patch subject: [PATCH v5 1/3] fs/file.c: remove sanity_check and add likely/unlikely in alloc_fd()

testcase: will-it-scale
test machine: 256 threads 2 sockets GENUINE INTEL(R) XEON(R) (Sierra Forest) with 128G memory
parameters:

	nr_task: 100%
	mode: process
	test: dup1
	cpufreq_governor: performance


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240806/202408062146.832faa23-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-13/performance/x86_64-rhel-8.3/process/100%/debian-12-x86_64-20240206.cgz/lkp-srf-2sp1/dup1/will-it-scale

commit:
  5f30e082ab ("Merge branch 'vfs.iomap' into vfs.all")
  f1139c8e66 ("fs/file.c: remove sanity_check and add likely/unlikely in alloc_fd()")

5f30e082ab8b3431 f1139c8e66d5c618aad04a93a23
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
    377983 ± 69%     +74.1%     658036 ± 17%  numa-meminfo.node0.AnonPages
     18.17 ± 10%     -48.6%       9.33 ± 35%  perf-c2c.DRAM.local
 8.796e+08            +1.2%  8.903e+08        will-it-scale.256.processes
   3436082            +1.2%    3477810        will-it-scale.per_process_ops
 8.796e+08            +1.2%  8.903e+08        will-it-scale.workload
 1.517e+11            -4.3%  1.452e+11        perf-stat.i.branch-instructions
      0.03 ±  8%       +0.0       0.04 ± 36%  perf-stat.i.branch-miss-rate%
      0.93            +3.9%       0.96        perf-stat.i.cpi
  7.13e+11            -3.5%   6.88e+11        perf-stat.i.instructions
      1.08            -3.4%       1.04        perf-stat.i.ipc
      0.93            +3.4%       0.96        perf-stat.overall.cpi
      1.08            -3.3%       1.04        perf-stat.overall.ipc
    245130            -4.4%     234451        perf-stat.overall.path-length
 1.512e+11            -4.3%  1.447e+11        perf-stat.ps.branch-instructions
 7.106e+11            -3.5%  6.857e+11        perf-stat.ps.instructions
 2.156e+14            -3.2%  2.087e+14        perf-stat.total.instructions
     14.90             -0.7      14.20        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.dup
     12.01             -0.7      11.32        perf-profile.calltrace.cycles-pp.__x64_sys_dup.do_syscall_64.entry_SYSCALL_64_after_hwframe.dup
     16.54             -0.7      15.88        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.dup
      6.44             -0.6       5.89        perf-profile.calltrace.cycles-pp.alloc_fd.__x64_sys_dup.do_syscall_64.entry_SYSCALL_64_after_hwframe.dup
      2.86             -0.0       2.82        perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.__close
      8.94             -0.0       8.90        perf-profile.calltrace.cycles-pp.filp_flush.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
      7.76             -0.0       7.72        perf-profile.calltrace.cycles-pp.locks_remove_posix.filp_flush.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.58             -0.0       2.54        perf-profile.calltrace.cycles-pp.__fput_sync.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
      1.11             -0.0       1.10        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
      1.33             +0.0       1.35        perf-profile.calltrace.cycles-pp.testcase
      0.54             +0.0       0.56        perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
      0.79             +0.0       0.82        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.dup
      1.33             +0.0       1.37        perf-profile.calltrace.cycles-pp.close@plt
      2.73             +0.1       2.78        perf-profile.calltrace.cycles-pp._raw_spin_lock.file_close_fd.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.05             +0.1       1.11        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.dup
      4.35             +0.1       4.42        perf-profile.calltrace.cycles-pp.file_close_fd.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
     22.18             +0.3      22.51        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.__close
     21.50 ±  2%       +1.5      23.02        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.dup
     12.10             -0.7      11.39        perf-profile.children.cycles-pp.__x64_sys_dup
     34.79             -0.7      34.12        perf-profile.children.cycles-pp.do_syscall_64
     38.04             -0.6      37.42        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      6.48             -0.6       5.90        perf-profile.children.cycles-pp.alloc_fd
      1.86             -0.5       1.41        perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.57             -0.1       0.47        perf-profile.children.cycles-pp.fd_install
      9.11             -0.0       9.07        perf-profile.children.cycles-pp.filp_flush
      7.93             -0.0       7.89        perf-profile.children.cycles-pp.locks_remove_posix
      2.61             -0.0       2.58        perf-profile.children.cycles-pp.__fput_sync
      1.16             +0.0       1.18        perf-profile.children.cycles-pp.x64_sys_call
      0.05             +0.0       0.07 ± 13%  perf-profile.children.cycles-pp.clockevents_program_event
      0.51             +0.0       0.53        perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      2.17             +0.0       2.20        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      5.72             +0.0       5.75        perf-profile.children.cycles-pp._raw_spin_lock
      2.10             +0.0       2.13        perf-profile.children.cycles-pp.testcase
      2.02             +0.0       2.06        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      0.13 ±  2%       +0.0       0.17        perf-profile.children.cycles-pp.dup@plt
      4.38             +0.1       4.46        perf-profile.children.cycles-pp.file_close_fd
     23.00             +0.1      23.11        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
     59.27             +0.5      59.73        perf-profile.children.cycles-pp.__close
     28.73             +1.1      29.80        perf-profile.children.cycles-pp.entry_SYSCALL_64
      1.86             -0.5       1.41        perf-profile.self.cycles-pp.syscall_return_via_sysret
      2.28             -0.2       2.12        perf-profile.self.cycles-pp.alloc_fd
      0.54             -0.1       0.43        perf-profile.self.cycles-pp.fd_install
      7.87             -0.0       7.83        perf-profile.self.cycles-pp.locks_remove_posix
      2.47             -0.0       2.44        perf-profile.self.cycles-pp.__fput_sync
      1.23             +0.0       1.24        perf-profile.self.cycles-pp.file_close_fd_locked
      1.09             +0.0       1.11        perf-profile.self.cycles-pp.x64_sys_call
      0.51             +0.0       0.53        perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      1.29             +0.0       1.32        perf-profile.self.cycles-pp.testcase
      5.66             +0.0       5.69        perf-profile.self.cycles-pp._raw_spin_lock
      1.95             +0.0       1.99        perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      2.85             +0.0       2.90        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.02 ±141%       +0.0       0.06 ± 13%  perf-profile.self.cycles-pp.ktime_get
      0.00             +0.1       0.07        perf-profile.self.cycles-pp.dup@plt
     22.93             +0.1      23.05        perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
     10.11             +0.2      10.34        perf-profile.self.cycles-pp.dup
     13.70             +0.3      13.98        perf-profile.self.cycles-pp.entry_SYSCALL_64
      9.84 ±  3%       +0.7      10.51        perf-profile.self.cycles-pp.__close




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki