Hello,

kernel test robot noticed a -4.7% regression of stress-ng.judy.ops_per_sec on:

commit: 6ccdcb6d3a741c4e005ca6ffd4a62ddf8b5bead3 ("mm, pcp: reduce detecting time of consecutive high order page freeing")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: stress-ng
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	class: cpu-cache
	test: judy
	disk: 1SSD
	cpufreq_governor: performance

In addition to that, the commit also has significant impact on the following tests:

+------------------+-------------------------------------------------------------------------------------------------+
| testcase: change | lmbench3: lmbench3.TCP.socket.bandwidth.10MB.MB/sec 23.7% improvement                           |
| test machine     | 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory |
| test parameters  | cpufreq_governor=performance                                                                    |
|                  | mode=development                                                                                |
|                  | nr_threads=100%                                                                                 |
|                  | test=TCP                                                                                        |
|                  | test_memory_size=50%                                                                            |
+------------------+-------------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.file-ioctl.ops_per_sec -6.6% regression                                    |
| test machine     | 36 threads 1 socket Intel(R) Core(TM) i9-9980XE CPU @ 3.00GHz (Skylake) with 32G memory         |
| test parameters  | class=filesystem                                                                                |
|                  | cpufreq_governor=performance                                                                    |
|                  | disk=1SSD                                                                                       |
|                  | fs=btrfs                                                                                        |
|                  | nr_threads=10%                                                                                  |
|                  | test=file-ioctl                                                                                 |
|                  | testtime=60s                                                                                    |
+------------------+-------------------------------------------------------------------------------------------------+

If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add the following tags:

| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202311231029.3aa790-oliver.sang@xxxxxxxxx

Details are as below:
-------------------------------------------------------------------------------------------------->

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231123/202311231029.3aa790-oliver.sang@xxxxxxxxx
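For completeness, the usual way to replay one of these jobs locally is via
lkp-tests. This is a sketch of the standard workflow, not copied from this
report: it assumes the job.yaml shipped in the archive above, and the exact
steps may differ for your environment:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml   # job file comes from the archive above
        sudo bin/lkp run job.yaml

        # if a failure blocks the run, removing ~/.lkp and /lkp
        # gives a clean state to retry from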
=========================================================================================
class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  cpu-cache/gcc-12/performance/1SSD/x86_64-rhel-8.3/100%/debian-11.1-x86_64-20220510.cgz/lkp-spr-2sp4/judy/stress-ng/60s

commit:
  57c0419c5f ("mm, pcp: decrease PCP high if free pages < high watermark")
  6ccdcb6d3a ("mm, pcp: reduce detecting time of consecutive high order page freeing")

57c0419c5f0ea2cc 6ccdcb6d3a741c4e005ca6ffd4a
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
      4.57 ±  5%     +46.8%       6.71 ± 17%  iostat.cpu.system
      2842            +1.0%       2871        turbostat.Bzy_MHz
      0.12 ±  3%      +0.4        0.55 ± 26%  mpstat.cpu.all.soft%
      3.05 ±  6%      +1.8        4.86 ± 20%  mpstat.cpu.all.sys%
  81120642            -2.9%   78746159        proc-vmstat.numa_hit
  80886548            -2.9%   78513494        proc-vmstat.numa_local
  82771023            -2.9%   80399459        proc-vmstat.pgalloc_normal
  82356596            -2.9%   79991041        proc-vmstat.pgfree
  12325708 ±  3%      +5.3%   12974746        perf-stat.i.dTLB-load-misses
      0.38 ± 44%     +27.2%       0.48        perf-stat.overall.cpi
    668.74 ± 44%     +24.7%     834.02        perf-stat.overall.cycles-between-cache-misses
      0.00 ± 45%      +0.0        0.01 ± 10%  perf-stat.overall.dTLB-load-miss-rate%
  10040254 ± 44%     +26.0%   12650801        perf-stat.ps.dTLB-load-misses
   7036371 ±  3%      -2.8%    6842720        stress-ng.judy.Judy_delete_operations_per_sec
   9244466 ±  3%      -7.8%    8524505 ±  3%  stress-ng.judy.Judy_insert_operations_per_sec
      2912 ±  3%      -4.7%       2774        stress-ng.judy.ops_per_sec
     13316 ±  8%     +22.8%      16355 ± 13%  stress-ng.time.maximum_resident_set_size
    445.86 ±  5%     +64.2%     732.21 ± 15%  stress-ng.time.system_time
     40885 ± 40%    +373.8%     193712 ± 11%  sched_debug.cfs_rq:/.left_vruntime.avg
    465264 ± 31%    +142.5%    1128399 ±  5%  sched_debug.cfs_rq:/.left_vruntime.stddev
      8322 ± 34%    +140.8%      20039 ± 17%  sched_debug.cfs_rq:/.load.avg
     40886 ± 40%    +373.8%     193713 ± 11%  sched_debug.cfs_rq:/.right_vruntime.avg
    465274 ± 31%    +142.5%    1128401 ±  5%  sched_debug.cfs_rq:/.right_vruntime.stddev
    818.77 ± 10%     +43.3%       1172 ±  5%  sched_debug.cpu.curr->pid.stddev
      0.05 ± 74%    +659.6%       0.41 ± 35%  perf-sched.sch_delay.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
      0.10 ± 48%    +140.3%       0.24 ± 11%  perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
      0.01 ± 14%    +102.6%       0.03 ± 29%  perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.05 ±122%   +1322.6%       0.65 ± 20%  perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.do_open
      1.70 ± 79%    +729.3%      14.10 ± 48%  perf-sched.sch_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      1.08 ±101%    +233.4%       3.60 ±  7%  perf-sched.sch_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.do_open
      0.01 ±  8%     +54.7%       0.02 ± 18%  perf-sched.total_sch_delay.average.ms
      0.18 ±  5%    +555.7%       1.20 ± 38%  perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
      0.21 ±  4%    +524.6%       1.29 ± 47%  perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
    235.65 ± 31%     -57.0%     101.40 ± 17%  perf-sched.wait_and_delay.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
    127.50 ±100%    +126.3%     288.50 ±  9%  perf-sched.wait_and_delay.count.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_call_function_single
    125.83 ±144%    +407.2%     638.17 ± 27%  perf-sched.wait_and_delay.count.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
    344.50 ± 36%    +114.6%     739.33 ± 24%  perf-sched.wait_and_delay.count.pipe_read.vfs_read.ksys_read.do_syscall_64
      0.92 ±114%    +482.2%       5.38 ± 47%  perf-sched.wait_and_delay.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_call_function_single
      3.22 ± 89%    +223.9%      10.44 ± 50%  perf-sched.wait_and_delay.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
      0.18 ± 43%    +471.8%       1.01 ± 36%  perf-sched.wait_time.avg.ms.__cond_resched.__alloc_pages.__folio_alloc.vma_alloc_folio.do_anonymous_page
     34.39 ± 46%     +88.8%      64.95 ± 18%  perf-sched.wait_time.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.21 ± 13%    +813.6%       1.95 ± 38%  perf-sched.wait_time.avg.ms.__cond_resched.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region.constprop
      0.18 ± 15%    +457.1%       1.02 ± 58%  perf-sched.wait_time.avg.ms.__cond_resched.unmap_vmas.unmap_region.constprop.0
    417.61 ± 68%     -87.6%      51.85 ±146%  perf-sched.wait_time.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.22 ± 25%    +614.2%       1.57 ± 71%  perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
      0.18 ±  5%    +556.3%       1.20 ± 38%  perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
      0.21 ±  4%    +524.6%       1.29 ± 47%  perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
     38.72 ± 39%     -53.1%      18.17 ± 30%  perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
    235.60 ± 31%     -57.0%     101.37 ± 17%  perf-sched.wait_time.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
      2.17 ± 30%     +45.3%       3.16 ± 13%  perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      1.02 ±131%    +574.3%       6.90 ± 52%  perf-sched.wait_time.max.ms.__cond_resched.__alloc_pages.__folio_alloc.vma_alloc_folio.do_anonymous_page
      0.18 ±191%  +92359.0%     169.05 ±219%  perf-sched.wait_time.max.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
     69.64 ± 44%     +33.2%      92.76 ±  4%  perf-sched.wait_time.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.64 ± 67%    +653.6%       4.82 ± 54%  perf-sched.wait_time.max.ms.__cond_resched.unmap_vmas.unmap_region.constprop.0
      1.75 ± 49%    +206.5%       5.38 ± 47%  perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_call_function_single
      3.22 ± 89%    +223.9%      10.44 ± 50%  perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
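Almost all of the regression above shows up as kernel time
(stress-ng.time.system_time +64.2% while ops_per_sec drops 4.7%), so the
stressor can also be profiled outside of lkp to see where the extra system
time goes. A sketch, assuming stress-ng's judy stressor options; instance
count and runtime are arbitrary:

        # one judy instance per CPU for 60s, recording kernel stacks
        perf record -a -g -- stress-ng --judy 0 --timeout 60s --metrics-brief
        perf report --sort symbol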
***************************************************************************************************
lkp-ivb-2ep1: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_threads/rootfs/tbox_group/test/test_memory_size/testcase:
  gcc-12/performance/x86_64-rhel-8.3/development/100%/debian-11.1-x86_64-20220510.cgz/lkp-ivb-2ep1/TCP/50%/lmbench3

commit:
  57c0419c5f ("mm, pcp: decrease PCP high if free pages < high watermark")
  6ccdcb6d3a ("mm, pcp: reduce detecting time of consecutive high order page freeing")

57c0419c5f0ea2cc 6ccdcb6d3a741c4e005ca6ffd4a
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
      0.07 ± 38%    +105.0%       0.14 ± 32%  perf-sched.sch_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.do_open
     26.75            -4.9%      25.45        turbostat.RAMWatt
    678809            +7.2%     727594 ±  2%  vmstat.system.cs
  97929782           -13.1%   85054266        numa-numastat.node0.local_node
  97933343           -13.1%   85056081        numa-numastat.node0.numa_hit
  97933344           -13.1%   85055901        numa-vmstat.node0.numa_hit
  97929783           -13.1%   85054086        numa-vmstat.node0.numa_local
     32188           +23.7%      39813        lmbench3.TCP.socket.bandwidth.10MB.MB/sec
    652.63            -4.4%     624.04        lmbench3.time.elapsed_time
    652.63            -4.4%     624.04        lmbench3.time.elapsed_time.max
      8597            -5.9%       8092        lmbench3.time.system_time
      0.88 ±  7%      -0.1        0.76 ±  5%  perf-profile.calltrace.cycles-pp.security_file_permission.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.71 ± 10%      -0.1        0.61 ±  7%  perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_read.ksys_read.do_syscall_64
      0.78 ±  3%      -0.1        0.70 ±  6%  perf-profile.children.cycles-pp.security_socket_recvmsg
      0.36 ±  9%      +0.1        0.42 ± 11%  perf-profile.children.cycles-pp.skb_page_frag_refill
      0.40 ± 10%      +0.1        0.48 ± 12%  perf-profile.children.cycles-pp.sk_page_frag_refill
      0.51 ±  4%      -0.1        0.44 ± 13%  perf-profile.self.cycles-pp.sock_read_iter
      0.36 ± 10%      +0.1        0.42 ± 11%  perf-profile.self.cycles-pp.skb_page_frag_refill
    158897 ±  2%      -6.8%     148107        proc-vmstat.nr_anon_pages
    160213 ±  2%      -6.8%     149290        proc-vmstat.nr_inactive_anon
    160213 ±  2%      -6.8%     149290        proc-vmstat.nr_zone_inactive_anon
 1.715e+08            -7.1%  1.593e+08        proc-vmstat.numa_hit
 1.715e+08            -7.1%  1.592e+08        proc-vmstat.numa_local
 1.367e+09            -7.1%   1.27e+09        proc-vmstat.pgalloc_normal
   2324641            -2.7%    2261187        proc-vmstat.pgfault
 1.367e+09            -7.1%   1.27e+09        proc-vmstat.pgfree
     77011            -4.4%      73597        proc-vmstat.pgreuse
      5.99 ±  3%     -29.9%       4.20 ±  4%  perf-stat.i.MPKI
 7.914e+09 ±  2%      +4.5%  8.271e+09        perf-stat.i.branch-instructions
  1.51e+08            +4.6%  1.579e+08        perf-stat.i.branch-misses
      7.65 ±  4%      -0.9        6.73 ±  3%  perf-stat.i.cache-miss-rate%
  66394790 ±  2%     -21.9%   51865866 ±  3%  perf-stat.i.cache-misses
    682132            +7.2%     731279 ±  2%  perf-stat.i.context-switches
      4.01           -16.0%       3.37        perf-stat.i.cpi
     71772 ±  4%     +11.5%      80055 ±  8%  perf-stat.i.cycles-between-cache-misses
 9.368e+09 ±  2%      +3.6%  9.706e+09        perf-stat.i.dTLB-stores
  33695419 ±  2%      +7.1%   36096466 ±  2%  perf-stat.i.iTLB-load-misses
    573897 ± 35%     -38.6%     352477 ± 19%  perf-stat.i.iTLB-loads
  4.09e+10 ±  2%      +4.5%  4.273e+10        perf-stat.i.instructions
      0.37            +4.3%       0.39        perf-stat.i.ipc
      0.09 ± 22%     -44.0%       0.05 ± 26%  perf-stat.i.major-faults
    490.16 ±  2%      -8.6%     448.21 ±  2%  perf-stat.i.metric.K/sec
    635.38 ±  2%      +3.5%     657.46        perf-stat.i.metric.M/sec
     37.54            +2.3       39.84        perf-stat.i.node-load-miss-rate%
   8300835 ±  2%     -10.8%    7406820 ±  2%  perf-stat.i.node-load-misses
  76993977 ±  3%      -6.6%   71936169 ±  3%  perf-stat.i.node-loads
     26.58 ±  4%      +4.1       30.71 ±  3%  perf-stat.i.node-store-miss-rate%
   2341211 ±  4%     -29.6%    1648802 ±  3%  perf-stat.i.node-store-misses
  34198780 ±  3%     -33.2%   22857201 ±  3%  perf-stat.i.node-stores
      1.63           -25.5%       1.21 ±  3%  perf-stat.overall.MPKI
     10.67            -2.3        8.36        perf-stat.overall.cache-miss-rate%
      2.83            -5.2%       2.69        perf-stat.overall.cpi
      1740           +27.3%       2216 ±  3%  perf-stat.overall.cycles-between-cache-misses
      0.35            +5.5%       0.37        perf-stat.overall.ipc
      9.73            -0.4        9.34        perf-stat.overall.node-load-miss-rate%
      6.39            +0.3        6.72        perf-stat.overall.node-store-miss-rate%
 7.914e+09 ±  2%      +4.6%  8.276e+09        perf-stat.ps.branch-instructions
 1.509e+08            +4.7%  1.579e+08        perf-stat.ps.branch-misses
  66615187 ±  2%     -22.1%   51881477 ±  3%  perf-stat.ps.cache-misses
    679734            +7.2%     729007 ±  2%  perf-stat.ps.context-switches
 9.369e+09 ±  2%      +3.7%  9.712e+09        perf-stat.ps.dTLB-stores
  33673038 ±  2%      +7.2%   36098564 ±  2%  perf-stat.ps.iTLB-load-misses
  4.09e+10 ±  2%      +4.6%  4.276e+10        perf-stat.ps.instructions
      0.09 ± 23%     -44.4%       0.05 ± 26%  perf-stat.ps.major-faults
   8328473 ±  2%     -11.0%    7410272 ±  2%  perf-stat.ps.node-load-misses
  77301667 ±  3%      -6.9%   71997671 ±  3%  perf-stat.ps.node-loads
   2344250 ±  4%     -29.7%    1647553 ±  3%  perf-stat.ps.node-store-misses
  34315831 ±  3%     -33.4%   22865994 ±  3%  perf-stat.ps.node-stores
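To sanity-check the bandwidth number by hand, lmbench3's bw_tcp can be run as
a standalone server/client pair. A sketch; the flag spellings below are
assumptions from lmbench3's bw_tcp and may vary between lmbench builds:

        bw_tcp -s                # start the server
        bw_tcp -m 10m 127.0.0.1  # move data in 10MB messages, prints MB/sec
        bw_tcp -S 127.0.0.1      # shut the server down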
***************************************************************************************************
lkp-skl-d08: 36 threads 1 socket Intel(R) Core(TM) i9-9980XE CPU @ 3.00GHz (Skylake) with 32G memory

=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  filesystem/gcc-12/performance/1SSD/btrfs/x86_64-rhel-8.3/10%/debian-11.1-x86_64-20220510.cgz/lkp-skl-d08/file-ioctl/stress-ng/60s

commit:
  57c0419c5f ("mm, pcp: decrease PCP high if free pages < high watermark")
  6ccdcb6d3a ("mm, pcp: reduce detecting time of consecutive high order page freeing")

57c0419c5f0ea2cc 6ccdcb6d3a741c4e005ca6ffd4a
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
    127.00 ± 10%     +36.1%     172.83 ± 15%  perf-c2c.HITM.local
      0.00 ± 72%    +130.4%       0.01 ± 30%  perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc.alloc_extent_state.__clear_extent_bit.btrfs_clone_files
     14.83 ± 19%     +33.7%      19.83 ± 10%  sched_debug.cpu.nr_uninterruptible.max
    339939            -6.6%     317593        stress-ng.file-ioctl.ops
      5665            -6.6%       5293        stress-ng.file-ioctl.ops_per_sec
      6444 ±  4%     -25.2%       4820 ±  5%  stress-ng.time.involuntary_context_switches
  89198237            -6.5%   83411572        proc-vmstat.numa_hit
  89117176            -6.8%   83056324        proc-vmstat.numa_local
  92833230            -6.6%   86743293        proc-vmstat.pgalloc_normal
  92791999            -6.6%   86700599        proc-vmstat.pgfree
      0.25 ± 56%    +110.2%       0.53 ± 12%  perf-stat.i.major-faults
    127575 ± 27%    +138.3%     303957 ±  3%  perf-stat.i.node-stores
      0.25 ± 56%    +110.2%       0.52 ± 12%  perf-stat.ps.major-faults
    125751 ± 27%    +138.3%     299653 ±  3%  perf-stat.ps.node-stores
 1.199e+12            -2.1%  1.174e+12        perf-stat.total.instructions
     15.80            -0.7       15.14        perf-profile.calltrace.cycles-pp.filemap_read_folio.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep.generic_remap_file_range_prep
     15.46            -0.6       14.84        perf-profile.calltrace.cycles-pp.btrfs_read_folio.filemap_read_folio.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep
      9.84            -0.5        9.32        perf-profile.calltrace.cycles-pp.memcmp.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep.generic_remap_file_range_prep.btrfs_remap_file_range
     11.95            -0.4       11.52        perf-profile.calltrace.cycles-pp.btrfs_do_readpage.btrfs_read_folio.filemap_read_folio.do_read_cache_folio.vfs_dedupe_file_range_compare
      8.72 ±  2%      -0.4        8.28        perf-profile.calltrace.cycles-pp.filemap_add_folio.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep.generic_remap_file_range_prep
      5.56 ±  2%      -0.4        5.18        perf-profile.calltrace.cycles-pp.__filemap_add_folio.filemap_add_folio.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep
      0.64 ± 10%      -0.3        0.36 ± 71%  perf-profile.calltrace.cycles-pp.find_free_extent.btrfs_reserve_extent.__btrfs_prealloc_file_range.btrfs_prealloc_file_range.btrfs_fallocate
      2.57 ±  5%      -0.3        2.29 ±  2%  perf-profile.calltrace.cycles-pp.ioctl_preallocate.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
      2.44 ±  6%      -0.3        2.17 ±  2%  perf-profile.calltrace.cycles-pp.btrfs_fallocate.vfs_fallocate.ioctl_preallocate.__x64_sys_ioctl.do_syscall_64
      2.53 ±  5%      -0.3        2.26 ±  2%  perf-profile.calltrace.cycles-pp.vfs_fallocate.ioctl_preallocate.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.66 ±  9%      -0.2        0.46 ± 45%  perf-profile.calltrace.cycles-pp.btrfs_reserve_extent.__btrfs_prealloc_file_range.btrfs_prealloc_file_range.btrfs_fallocate.vfs_fallocate
      1.42 ±  3%      -0.1        1.31 ±  4%  perf-profile.calltrace.cycles-pp.clear_state_bit.__clear_extent_bit.btrfs_invalidate_folio.truncate_cleanup_folio.truncate_inode_pages_range
      0.70 ±  4%      -0.1        0.62 ±  2%  perf-profile.calltrace.cycles-pp.__mod_lruvec_page_state.__filemap_add_folio.filemap_add_folio.do_read_cache_folio.vfs_dedupe_file_range_compare
      0.69 ±  4%      -0.1        0.63 ±  4%  perf-profile.calltrace.cycles-pp.btrfs_punch_hole.btrfs_fallocate.vfs_fallocate.ioctl_preallocate.__x64_sys_ioctl
     29.90            +0.6       30.49        perf-profile.calltrace.cycles-pp.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep.generic_remap_file_range_prep.btrfs_remap_file_range
      0.00            +0.9        0.86 ±  6%  perf-profile.calltrace.cycles-pp.__list_del_entry_valid_or_report.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist
     68.10            +1.2       69.29        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl.stress_run
     68.47            +1.2       69.68        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.ioctl.stress_run
     67.35            +1.2       68.59        perf-profile.calltrace.cycles-pp.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl.stress_run
     21.54 ±  3%      +1.5       23.02        perf-profile.calltrace.cycles-pp.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe
     21.51 ±  3%      +1.5       23.00        perf-profile.calltrace.cycles-pp.do_clone_file_range.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl
     21.46 ±  3%      +1.5       22.94        perf-profile.calltrace.cycles-pp.btrfs_remap_file_range.do_clone_file_range.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl
     21.53 ±  3%      +1.5       23.01        perf-profile.calltrace.cycles-pp.vfs_clone_file_range.ioctl_file_clone.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64
      0.00            +1.5        1.49 ±  3%  perf-profile.calltrace.cycles-pp.__free_one_page.free_pcppages_bulk.free_unref_page_commit.free_unref_page.btrfs_clone
     21.15 ±  3%      +1.5       22.66        perf-profile.calltrace.cycles-pp.btrfs_clone_files.btrfs_remap_file_range.do_clone_file_range.vfs_clone_file_range.ioctl_file_clone
     64.61            +1.5       66.16        perf-profile.calltrace.cycles-pp.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
      2.66 ±  2%      +1.8        4.51 ±  3%  perf-profile.calltrace.cycles-pp.folio_alloc.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep.generic_remap_file_range_prep
      0.97 ±  3%      +1.8        2.82 ±  5%  perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages.folio_alloc.do_read_cache_folio
      2.02 ±  3%      +1.9        3.90 ±  4%  perf-profile.calltrace.cycles-pp.__alloc_pages.folio_alloc.do_read_cache_folio.vfs_dedupe_file_range_compare.__generic_remap_file_range_prep
      1.27 ±  2%      +1.9        3.17 ±  4%  perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.folio_alloc.do_read_cache_folio.vfs_dedupe_file_range_compare
      0.35 ± 70%      +2.0        2.31 ±  5%  perf-profile.calltrace.cycles-pp.__rmqueue_pcplist.rmqueue.get_page_from_freelist.__alloc_pages.folio_alloc
      0.00            +2.0        2.00 ±  4%  perf-profile.calltrace.cycles-pp.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist.__alloc_pages
      1.72 ±  2%      +2.1        3.78        perf-profile.calltrace.cycles-pp.btrfs_clone.btrfs_clone_files.btrfs_remap_file_range.do_clone_file_range.vfs_clone_file_range
      0.00            +2.1        2.09 ±  2%  perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page_commit.free_unref_page.btrfs_clone.btrfs_clone_files
      0.00            +2.1        2.12 ±  2%  perf-profile.calltrace.cycles-pp.free_unref_page_commit.free_unref_page.btrfs_clone.btrfs_clone_files.btrfs_remap_file_range
      0.00            +2.1        2.14 ±  2%  perf-profile.calltrace.cycles-pp.free_unref_page.btrfs_clone.btrfs_clone_files.btrfs_remap_file_range.do_clone_file_range
     15.81            -0.7       15.15        perf-profile.children.cycles-pp.filemap_read_folio
     15.47            -0.6       14.86        perf-profile.children.cycles-pp.btrfs_read_folio
      9.89            -0.5        9.38        perf-profile.children.cycles-pp.memcmp
     11.98            -0.4       11.54        perf-profile.children.cycles-pp.btrfs_do_readpage
      8.74 ±  2%      -0.4        8.30        perf-profile.children.cycles-pp.filemap_add_folio
      9.73 ±  3%      -0.4        9.35        perf-profile.children.cycles-pp.__clear_extent_bit
      5.66 ±  2%      -0.4        5.30        perf-profile.children.cycles-pp.__filemap_add_folio
      2.45 ±  6%      -0.3        2.17 ±  2%  perf-profile.children.cycles-pp.btrfs_fallocate
      2.57 ±  5%      -0.3        2.29 ±  2%  perf-profile.children.cycles-pp.ioctl_preallocate
      2.53 ±  5%      -0.3        2.26 ±  2%  perf-profile.children.cycles-pp.vfs_fallocate
      4.67 ±  2%      -0.3        4.41 ±  3%  perf-profile.children.cycles-pp.__set_extent_bit
      4.83 ±  2%      -0.3        4.58 ±  3%  perf-profile.children.cycles-pp.lock_extent
      5.06 ±  2%      -0.2        4.82 ±  2%  perf-profile.children.cycles-pp.alloc_extent_state
      4.11 ±  2%      -0.2        3.94 ±  2%  perf-profile.children.cycles-pp.kmem_cache_alloc
      1.37 ±  4%      -0.1        1.25 ±  2%  perf-profile.children.cycles-pp.__mod_lruvec_page_state
      0.66 ±  9%      -0.1        0.54 ±  6%  perf-profile.children.cycles-pp.btrfs_reserve_extent
      0.64 ± 10%      -0.1        0.53 ±  6%  perf-profile.children.cycles-pp.find_free_extent
      0.96 ±  4%      -0.1        0.87 ±  6%  perf-profile.children.cycles-pp.__wake_up
      0.62 ±  4%      -0.1        0.54 ±  6%  perf-profile.children.cycles-pp.__cond_resched
      1.20 ±  4%      -0.1        1.12 ±  3%  perf-profile.children.cycles-pp.free_extent_state
      0.99 ±  3%      -0.1        0.92 ±  4%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.89 ±  3%      -0.1        0.81 ±  5%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      0.69 ±  4%      -0.1        0.64 ±  4%  perf-profile.children.cycles-pp.btrfs_punch_hole
      0.12 ± 10%      -0.0        0.09 ± 10%  perf-profile.children.cycles-pp.__fget_light
      0.02 ±141%      +0.0        0.06 ± 13%  perf-profile.children.cycles-pp.calc_available_free_space
      0.29 ±  8%      +0.1        0.39 ±  6%  perf-profile.children.cycles-pp.__mod_zone_page_state
      0.09 ± 17%      +0.2        0.25 ±  6%  perf-profile.children.cycles-pp.__kmalloc_node
      0.09 ± 15%      +0.2        0.25 ±  4%  perf-profile.children.cycles-pp.kvmalloc_node
      0.08 ± 11%      +0.2        0.24 ±  4%  perf-profile.children.cycles-pp.__kmalloc_large_node
      0.24 ± 13%      +0.2        0.41 ±  4%  perf-profile.children.cycles-pp.__list_add_valid_or_report
      0.32 ± 15%      +0.6        0.91 ±  4%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     30.03            +0.6       30.64        perf-profile.children.cycles-pp.do_read_cache_folio
      1.10 ±  4%      +0.6        1.72 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
      0.58 ±  6%      +0.9        1.50 ±  5%  perf-profile.children.cycles-pp.__list_del_entry_valid_or_report
     67.36            +1.2       68.60        perf-profile.children.cycles-pp.__x64_sys_ioctl
     21.52 ±  3%      +1.5       23.00        perf-profile.children.cycles-pp.do_clone_file_range
     21.54 ±  3%      +1.5       23.02        perf-profile.children.cycles-pp.ioctl_file_clone
     21.53 ±  3%      +1.5       23.01        perf-profile.children.cycles-pp.vfs_clone_file_range
     21.16 ±  3%      +1.5       22.66        perf-profile.children.cycles-pp.btrfs_clone_files
      0.00            +1.5        1.52 ±  3%  perf-profile.children.cycles-pp.__free_one_page
     64.61            +1.5       66.16        perf-profile.children.cycles-pp.do_vfs_ioctl
     64.16            +1.5       65.71        perf-profile.children.cycles-pp.btrfs_remap_file_range
      2.68 ±  3%      +1.8        4.52 ±  3%  perf-profile.children.cycles-pp.folio_alloc
      0.54 ±  6%      +2.0        2.51 ±  5%  perf-profile.children.cycles-pp.__rmqueue_pcplist
      1.03 ±  3%      +2.0        3.04 ±  5%  perf-profile.children.cycles-pp.rmqueue
      2.16 ±  3%      +2.0        4.19 ±  4%  perf-profile.children.cycles-pp.__alloc_pages
      1.32 ±  2%      +2.1        3.42 ±  4%  perf-profile.children.cycles-pp.get_page_from_freelist
      0.00            +2.1        2.10 ±  2%  perf-profile.children.cycles-pp.free_pcppages_bulk
      2.66 ±  2%      +2.1        4.77        perf-profile.children.cycles-pp.btrfs_clone
      0.03 ±100%      +2.1        2.17 ±  2%  perf-profile.children.cycles-pp.free_unref_page
      0.40 ±  6%      +2.2        2.55 ±  2%  perf-profile.children.cycles-pp.free_unref_page_commit
      0.00            +2.2        2.21 ±  4%  perf-profile.children.cycles-pp.rmqueue_bulk
      9.82            -0.5        9.32        perf-profile.self.cycles-pp.memcmp
      0.84 ±  5%      -0.1        0.76 ±  6%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      1.13 ±  4%      -0.1        1.05 ±  2%  perf-profile.self.cycles-pp.free_extent_state
      0.99 ±  3%      -0.1        0.92 ±  4%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.22 ±  8%      -0.1        0.16 ± 13%  perf-profile.self.cycles-pp.find_free_extent
      0.38 ±  4%      -0.1        0.32 ±  8%  perf-profile.self.cycles-pp.__cond_resched
      0.12 ± 10%      -0.0        0.08 ± 11%  perf-profile.self.cycles-pp.__fget_light
      0.06 ±  7%      -0.0        0.04 ± 45%  perf-profile.self.cycles-pp.__x64_sys_ioctl
      0.07 ± 15%      +0.0        0.10 ±  9%  perf-profile.self.cycles-pp.folio_alloc
      0.28 ± 10%      +0.1        0.36 ±  7%  perf-profile.self.cycles-pp.get_page_from_freelist
      0.26 ±  8%      +0.1        0.36 ±  4%  perf-profile.self.cycles-pp.__mod_zone_page_state
      0.22 ± 14%      +0.2        0.38 ±  5%  perf-profile.self.cycles-pp.__list_add_valid_or_report
      0.00            +0.2        0.24 ±  6%  perf-profile.self.cycles-pp.free_pcppages_bulk
      0.32 ± 15%      +0.6        0.91 ±  4%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      0.00            +0.6        0.62 ± 10%  perf-profile.self.cycles-pp.rmqueue_bulk
      0.55 ±  6%      +0.9        1.46 ±  5%  perf-profile.self.cycles-pp.__list_del_entry_valid_or_report
      0.00            +1.3        1.32 ±  4%  perf-profile.self.cycles-pp.__free_one_page
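The profile deltas above (cycles moving into rmqueue_bulk, free_pcppages_bulk,
__free_one_page and native_queued_spin_lock_slowpath) are the kind of shift
that perf diff reports directly when the same workload is recorded on both
kernels. A sketch; the file names and stressor invocation are illustrative,
not taken from this report:

        # boot each kernel in turn and record the same workload
        perf record -a -g -o perf.data.57c0419c5f -- stress-ng --file-ioctl 4 --timeout 60s
        perf record -a -g -o perf.data.6ccdcb6d3a -- stress-ng --file-ioctl 4 --timeout 60s

        # compare per-symbol overhead between the two recordings
        perf diff perf.data.57c0419c5f perf.data.6ccdcb6d3a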
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki