Hello,

kernel test robot noticed a 10.5% improvement of hackbench.throughput on:


commit: f017b0a4951fac8f150232661b2cc0b67e0c57f0 ("pipe: don't update {a,c,m}time for anonymous pipes")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

testcase: hackbench
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
parameters:

	nr_threads: 800%
	iterations: 4
	mode: threads
	ipc: pipe
	cpufreq_governor: performance


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250214/202502141548.9fa68773-lkp@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
  gcc-12/performance/pipe/4/x86_64-rhel-9.4/threads/800%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp2/hackbench

commit:
  262b2fa99c ("pipe: introduce struct file_operations pipeanon_fops")
  f017b0a495 ("pipe: don't update {a,c,m}time for anonymous pipes")

262b2fa99cbe02a7 f017b0a4951fac8f150232661b2
---------------- ---------------------------
         %stddev    %change           %stddev
            \          |                 \
    319054            -2.8%      310139        proc-vmstat.nr_active_anon
    319054            -2.8%      310139        proc-vmstat.nr_zone_active_anon
    549457 ± 92%     -94.4%       30640 ± 30%  sched_debug.cfs_rq:/.load.max
     49885 ± 87%     -88.9%        5535 ± 17%  sched_debug.cfs_rq:/.load.stddev
   1266298           +10.5%     1399088        hackbench.throughput
   1237971 ±  2%     +10.0%     1361485 ±  2%  hackbench.throughput_avg
   1266298           +10.5%     1399088        hackbench.throughput_best
      4837 ±  2%     -11.3%        4289 ±  2%  hackbench.time.system_time
 6.114e+10            -4.2%    5.86e+10        perf-stat.i.branch-instructions
  2.74e+11            -2.0%   2.686e+11        perf-stat.i.cpu-cycles
      1167 ±  3%      -7.4%        1080 ±  3%  perf-stat.i.cycles-between-cache-misses
 2.527e+11            -6.0%   2.376e+11        perf-stat.i.instructions
      0.87 ±  3%     +15.0%        1.00 ±  4%  perf-stat.overall.MPKI
      1.07            +4.2%        1.12        perf-stat.overall.cpi
      1233 ±  3%      -9.3%        1118 ±  4%  perf-stat.overall.cycles-between-cache-misses
      0.93            -4.0%        0.89        perf-stat.overall.ipc
  6.45e+10            -4.5%   6.161e+10        perf-stat.ps.branch-instructions
 2.318e+08 ±  2%      +7.7%   2.496e+08 ±  4%  perf-stat.ps.cache-misses
 2.856e+11            -2.4%   2.788e+11        perf-stat.ps.cpu-cycles
 2.662e+11            -6.3%   2.494e+11        perf-stat.ps.instructions
     10565 ±  3%      +8.0%       11409 ±  2%  perf-stat.ps.minor-faults
     10565 ±  3%      +8.0%       11409 ±  2%  perf-stat.ps.page-faults
 1.435e+13           -14.2%   1.232e+13        perf-stat.total.instructions
    299.84 ± 47%    -100.0%        0.00        perf-sched.sch_delay.avg.ms.__cond_resched.__alloc_frozen_pages_noprof.alloc_pages_mpol.alloc_pages_noprof.pipe_write
     35.32 ± 24%     -46.6%       18.84 ± 30%  perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
    169.52 ± 79%    -100.0%        0.00        perf-sched.sch_delay.avg.ms.__cond_resched.mutex_lock.pipe_read.vfs_read.ksys_read
    308.81 ± 34%    -100.0%        0.00        perf-sched.sch_delay.avg.ms.__cond_resched.mutex_lock.pipe_write.vfs_write.ksys_write
    308.90 ± 30%     -47.0%      163.58 ± 19%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
     10.16 ±210%     -99.7%        0.03 ±115%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
     85.33 ± 25%    -100.0%        0.00        perf-sched.sch_delay.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
    209.12 ± 31%    -100.0%        0.00        perf-sched.sch_delay.avg.ms.pipe_write.vfs_write.ksys_write.do_syscall_64
     85.21 ± 62%    -100.0%        0.00        perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.pipe_read
    374.84 ± 38%    -100.0%        0.00        perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.pipe_write
     39.29 ± 55%     -55.1%       17.63 ± 13%  perf-sched.sch_delay.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      5455 ± 49%    -100.0%        0.00        perf-sched.sch_delay.max.ms.__cond_resched.__alloc_frozen_pages_noprof.alloc_pages_mpol.alloc_pages_noprof.pipe_write
      6980 ± 12%    -100.0%        0.00        perf-sched.sch_delay.max.ms.__cond_resched.mutex_lock.pipe_read.vfs_read.ksys_read
      8278 ±  8%    -100.0%        0.00        perf-sched.sch_delay.max.ms.__cond_resched.mutex_lock.pipe_write.vfs_write.ksys_write
      8110 ±  9%     -36.9%        5114 ± 16%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      8143 ± 12%    -100.0%        0.00        perf-sched.sch_delay.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
      8560 ±  9%    -100.0%        0.00        perf-sched.sch_delay.max.ms.pipe_write.vfs_write.ksys_write.do_syscall_64
      2455 ±109%    -100.0%        0.00        perf-sched.sch_delay.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.pipe_read
      7556 ± 13%    -100.0%        0.00        perf-sched.sch_delay.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.pipe_write
      8543 ± 11%     -37.6%        5332 ± 16%  perf-sched.sch_delay.max.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
     96.49 ± 28%     -44.7%       53.38 ± 12%  perf-sched.total_sch_delay.average.ms
      8719 ± 10%     -37.3%        5462 ± 15%  perf-sched.total_sch_delay.max.ms
    261.40 ± 29%     -46.0%      141.08 ± 12%  perf-sched.total_wait_and_delay.average.ms
     17438 ± 10%     -37.9%       10828 ± 16%  perf-sched.total_wait_and_delay.max.ms
    164.90 ± 30%     -46.8%       87.70 ± 13%  perf-sched.total_wait_time.average.ms
      8862 ± 11%     -35.6%        5710 ± 15%  perf-sched.total_wait_time.max.ms
    846.91 ± 36%    -100.0%        0.00        perf-sched.wait_and_delay.avg.ms.__cond_resched.__alloc_frozen_pages_noprof.alloc_pages_mpol.alloc_pages_noprof.pipe_write
    846.15 ± 37%    -100.0%        0.00        perf-sched.wait_and_delay.avg.ms.__cond_resched.mutex_lock.pipe_write.vfs_write.ksys_write
    858.41 ± 34%     -50.4%      426.01 ± 19%  perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
    227.96 ± 27%    -100.0%        0.00        perf-sched.wait_and_delay.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
    576.02 ± 31%    -100.0%        0.00        perf-sched.wait_and_delay.avg.ms.pipe_write.vfs_write.ksys_write.do_syscall_64
    983.53 ± 40%    -100.0%        0.00        perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.pipe_write
     67.17 ± 10%    -100.0%        0.00        perf-sched.wait_and_delay.count.__cond_resched.__alloc_frozen_pages_noprof.alloc_pages_mpol.alloc_pages_noprof.pipe_write
      7320 ±  6%    -100.0%        0.00        perf-sched.wait_and_delay.count.__cond_resched.mutex_lock.pipe_write.vfs_write.ksys_write
    752867 ±  2%    -100.0%        0.00        perf-sched.wait_and_delay.count.pipe_read.vfs_read.ksys_read.do_syscall_64
     96327 ±  3%    -100.0%        0.00        perf-sched.wait_and_delay.count.pipe_write.vfs_write.ksys_write.do_syscall_64
      1106 ± 10%    -100.0%        0.00        perf-sched.wait_and_delay.count.schedule_preempt_disabled.__mutex_lock.constprop.0.pipe_write
     11731 ± 36%    -100.0%        0.00        perf-sched.wait_and_delay.max.ms.__cond_resched.__alloc_frozen_pages_noprof.alloc_pages_mpol.alloc_pages_noprof.pipe_write
     16557 ±  8%    -100.0%        0.00        perf-sched.wait_and_delay.max.ms.__cond_resched.mutex_lock.pipe_write.vfs_write.ksys_write
     16104 ±  9%     -36.4%       10235 ± 16%  perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
     16318 ± 12%    -100.0%        0.00        perf-sched.wait_and_delay.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
     17121 ±  9%    -100.0%        0.00        perf-sched.wait_and_delay.max.ms.pipe_write.vfs_write.ksys_write.do_syscall_64
     15123 ± 13%    -100.0%        0.00        perf-sched.wait_and_delay.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.pipe_write
      8275 ± 15%     -33.0%        5544 ± 15%  perf-sched.wait_and_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     17047 ± 11%     -37.3%       10687 ± 16%  perf-sched.wait_and_delay.max.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
    547.07 ± 33%    -100.0%        0.00        perf-sched.wait_time.avg.ms.__cond_resched.__alloc_frozen_pages_noprof.alloc_pages_mpol.alloc_pages_noprof.pipe_write
    374.49 ± 48%    -100.0%        0.00        perf-sched.wait_time.avg.ms.__cond_resched.__mutex_lock.constprop.0.pipe_write
     36.27 ± 19%     -44.7%       20.06 ± 25%  perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
    280.11 ± 85%    -100.0%        0.00        perf-sched.wait_time.avg.ms.__cond_resched.mutex_lock.pipe_read.vfs_read.ksys_read
    537.34 ± 38%    -100.0%        0.00        perf-sched.wait_time.avg.ms.__cond_resched.mutex_lock.pipe_write.vfs_write.ksys_write
      0.35 ±138%    -100.0%        0.00        perf-sched.wait_time.avg.ms.__cond_resched.pipe_read.vfs_read.ksys_read.do_syscall_64
    549.51 ± 37%     -52.2%      262.43 ± 20%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
     10.16 ±210%     -99.6%        0.04 ±134%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
    142.63 ± 28%    -100.0%        0.00        perf-sched.wait_time.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
    366.89 ± 31%    -100.0%        0.00        perf-sched.wait_time.avg.ms.pipe_write.vfs_write.ksys_write.do_syscall_64
     39.52 ± 95%     -73.6%       10.44 ± 53%  perf-sched.wait_time.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
    149.09 ± 38%    -100.0%        0.00        perf-sched.wait_time.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.pipe_read
    608.70 ± 42%    -100.0%        0.00        perf-sched.wait_time.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.pipe_write
     11.89 ±178%  +22112.3%        2641 ± 61%  perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff
     43.32 ± 27%     -61.1%       16.86 ± 29%  perf-sched.wait_time.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      6944 ± 17%    -100.0%        0.00        perf-sched.wait_time.max.ms.__cond_resched.__alloc_frozen_pages_noprof.alloc_pages_mpol.alloc_pages_noprof.pipe_write
      1676 ±126%    -100.0%        0.00        perf-sched.wait_time.max.ms.__cond_resched.__mutex_lock.constprop.0.pipe_write
      7277 ± 10%    -100.0%        0.00        perf-sched.wait_time.max.ms.__cond_resched.mutex_lock.pipe_read.vfs_read.ksys_read
      8328 ±  8%    -100.0%        0.00        perf-sched.wait_time.max.ms.__cond_resched.mutex_lock.pipe_write.vfs_write.ksys_write
      3.54 ±175%    -100.0%        0.00        perf-sched.wait_time.max.ms.__cond_resched.pipe_read.vfs_read.ksys_read.do_syscall_64
      8192 ±  9%     -37.5%        5122 ± 16%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      7035 ±  5%     -68.5%        2216 ± 81%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown].[unknown]
      8490 ± 12%    -100.0%        0.00        perf-sched.wait_time.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
      8581 ±  9%    -100.0%        0.00        perf-sched.wait_time.max.ms.pipe_write.vfs_write.ksys_write.do_syscall_64
    915.16 ±118%     -86.0%      127.99 ± 79%  perf-sched.wait_time.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      4449 ± 64%    -100.0%        0.00        perf-sched.wait_time.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.pipe_read
      8142 ± 13%    -100.0%        0.00        perf-sched.wait_time.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.pipe_write
     11.89 ±178%  +27553.2%        3288 ± 58%  perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff
      8275 ± 15%     -33.0%        5544 ± 15%  perf-sched.wait_time.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      8719 ± 11%     -36.0%        5584 ± 16%  perf-sched.wait_time.max.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      8432 ± 10%     -36.3%        5373 ± 19%  perf-sched.wait_time.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki