Hello, kernel test robot noticed a -11.9% improvement of autonuma-benchmark.numa01_THREAD_ALLOC.seconds on: commit: 1ef5cbb92bdb320c5eb9fdee1a811d22ee9e19fe ("[RFC PATCH V1 2/6] sched/numa: Add disjoint vma unconditional scan logic") url: https://github.com/intel-lab-lkp/linux/commits/Raghavendra-K-T/sched-numa-Move-up-the-access-pid-reset-logic/20230829-141007 base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 2f88c8e802c8b128a155976631f4eb2ce4f3c805 patch link: https://lore.kernel.org/all/87e3c08bd1770dd3e6eee099c01e595f14c76fc3.1693287931.git.raghavendra.kt@xxxxxxx/ patch subject: [RFC PATCH V1 2/6] sched/numa: Add disjoint vma unconditional scan logic testcase: autonuma-benchmark test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory parameters: iterations: 4x test: numa01_THREAD_ALLOC cpufreq_governor: performance hi, Raghu, the reason there is a separate report for this commit besides https://lore.kernel.org/all/202309102311.84b42068-oliver.sang@xxxxxxxxx/ is due to bisection nature, for one auto-bisect, we so far only could capture one commit for performance change. this auto-bisect is running on another test machine (Sapphire Rapids), and it happened to choose autonuma-benchmark.numa01_THREAD_ALLOC.seconds as indicator to do the bisect, it finally captured "[RFC PATCH V1 2/6] sched/numa: Add disjoint vma unconditional" and from https://lore.kernel.org/all/acf254e9-0207-7030-131f-8a3f520c657b@xxxxxxx/ I noticed you care more about the performance impact of whole patch set, so let me give a summary table as below. firstly, let me give out how we apply your patch again: 68cfe9439a1ba (linux-review/Raghavendra-K-T/sched-numa-Move-up-the-access-pid-reset-logic/20230829-141007) sched/numa: Allow scanning of shared VMAs af46f3c9ca2d1 sched/numa: Allow recently accessed VMAs to be scanned 167773d1ddb5f sched/numa: Increase tasks' access history fc769221b2306 sched/numa: Remove unconditional scan logic using mm numa_scan_seq 1ef5cbb92bdb3 sched/numa: Add disjoint vma unconditional scan logic 2a806eab1c2e1 sched/numa: Move up the access pid reset logic 2f88c8e802c8b (tip/sched/core) sched/eevdf/doc: Modify the documented knob to base_slice_ns as well we have below data on this test machine (full table will be very big, if you want it, please let me know): ========================================================================================= compiler/cpufreq_governor/iterations/kconfig/rootfs/tbox_group/test/testcase: gcc-12/performance/4x/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-spr-r02/numa01_THREAD_ALLOC/autonuma-benchmark commit: 2f88c8e802 ("(tip/sched/core) sched/eevdf/doc: Modify the documented knob to base_slice_ns as well") 2a806eab1c ("sched/numa: Move up the access pid reset logic") 1ef5cbb92b ("sched/numa: Add disjoint vma unconditional scan logic") 68cfe9439a ("sched/numa: Allow scanning of shared VMAs") 2f88c8e802c8b128 2a806eab1c2e1c9f0ae39dc0307 1ef5cbb92bdb320c5eb9fdee1a8 68cfe9439a1baa642e05883fa64 ---------------- --------------------------- --------------------------- --------------------------- %stddev %change %stddev %change %stddev %change %stddev \ | \ | \ | \ 271.01 +0.8% 273.24 -0.7% 269.00 -26.4% 199.49 ± 3% autonuma-benchmark.numa01.seconds 76.28 +0.2% 76.44 -11.7% 67.36 ± 6% -46.9% 40.49 ± 5% autonuma-benchmark.numa01_THREAD_ALLOC.seconds 8.11 -0.9% 8.04 -0.7% 8.05 -0.1% 8.10 autonuma-benchmark.numa02.seconds 1425 +0.7% 1434 -3.1% 1381 -30.1% 996.02 ± 2% autonuma-benchmark.time.elapsed_time it has some difference with our previous report on Ice Lake that autonuma-benchmark.numa02.seconds seems keep stable, but autonuma-benchmark.numa01.seconds has more changes. anyway, for both platforms, we see performance improvement consistently in this test along the patch-set. Details are as below: --------------------------------------------------------------------------------------------------> The kernel config and materials to reproduce are available at: https://download.01.org/0day-ci/archive/20230912/202309121417.53f44ad6-oliver.sang@xxxxxxxxx below are normal data we shared in our performance reports. FYI. (you won't see data for autonuma-benchmark.numa01.seconds or autonuma-benchmark.numa02.seconds, since the delta bewteen 2a806eab1c and 1ef5cbb92b are small so our tool won't show them) ========================================================================================= compiler/cpufreq_governor/iterations/kconfig/rootfs/tbox_group/test/testcase: gcc-12/performance/4x/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-spr-r02/numa01_THREAD_ALLOC/autonuma-benchmark commit: 2a806eab1c ("sched/numa: Move up the access pid reset logic") 1ef5cbb92b ("sched/numa: Add disjoint vma unconditional scan logic") 2a806eab1c2e1c9f 1ef5cbb92bdb320c5eb9fdee1a8 ---------------- --------------------------- %stddev %change %stddev \ | \ 0.00 ± 79% +0.0 0.00 ± 13% mpstat.cpu.all.iowait% 357.33 ± 12% +90.4% 680.50 ± 30% perf-c2c.DRAM.remote 79.17 ± 14% +34.7% 106.67 ± 18% perf-c2c.HITM.remote 16378 ± 16% +53.9% 25200 ± 22% turbostat.POLL 50.24 +15.4% 57.99 turbostat.RAMWatt 37.04 ±199% -97.2% 1.05 ±141% perf-sched.wait_time.avg.ms.__cond_resched.exit_mmap.__mmput.exit_mm.do_exit 7.46 ± 23% -43.7% 4.20 ± 47% perf-sched.wait_time.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown] 170.20 ±218% -99.4% 1.05 ±141% perf-sched.wait_time.max.ms.__cond_resched.exit_mmap.__mmput.exit_mm.do_exit 283.88 ± 28% +49.3% 423.88 ± 16% perf-sched.wait_time.max.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64 189.72 ± 23% +50.9% 286.24 ± 25% perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread 76.44 -11.9% 67.36 ± 6% autonuma-benchmark.numa01_THREAD_ALLOC.seconds 1434 -3.7% 1381 autonuma-benchmark.time.elapsed_time 1434 -3.7% 1381 autonuma-benchmark.time.elapsed_time.max 1132634 -6.0% 1064224 ± 2% autonuma-benchmark.time.involuntary_context_switches 2532130 ± 2% +4.5% 2645367 ± 2% autonuma-benchmark.time.minor_page_faults 293184 -3.6% 282626 autonuma-benchmark.time.user_time 16101 +41.9% 22846 ± 4% autonuma-benchmark.time.voluntary_context_switches 6.41 ± 52% +3833.7% 251.97 ± 6% sched_debug.cfs_rq:/.util_est_enqueued.avg 401.88 ± 4% +179.2% 1121 ± 3% sched_debug.cfs_rq:/.util_est_enqueued.max 39.18 ± 16% +698.0% 312.66 ± 3% sched_debug.cfs_rq:/.util_est_enqueued.stddev 1662842 +10.5% 1838160 ± 2% sched_debug.cpu.avg_idle.avg 860266 ± 3% -22.4% 667568 ± 11% sched_debug.cpu.avg_idle.min 647306 ± 4% +13.6% 735595 ± 2% sched_debug.cpu.avg_idle.stddev 664890 +10.4% 733919 ± 2% sched_debug.cpu.max_idle_balance_cost.avg 203832 ± 4% +45.7% 296934 ± 4% sched_debug.cpu.max_idle_balance_cost.stddev 58841 ± 19% +205.6% 179845 ± 8% proc-vmstat.numa_hint_faults 47138 ± 20% +145.1% 115557 ± 8% proc-vmstat.numa_hint_faults_local 652.00 ± 27% +5217.2% 34668 ± 10% proc-vmstat.numa_huge_pte_updates 108295 ± 25% +3179.6% 3551657 ± 11% proc-vmstat.numa_pages_migrated 499336 ± 16% +3503.7% 17994636 ± 10% proc-vmstat.numa_pte_updates 108295 ± 25% +3179.6% 3551657 ± 11% proc-vmstat.pgmigrate_success 238140 +6.7% 254200 proc-vmstat.pgreuse 191.00 ± 29% +3488.8% 6854 ± 11% proc-vmstat.thp_migration_success 4331500 -4.5% 4135400 ± 2% proc-vmstat.unevictable_pgs_scanned 0.66 +0.0 0.67 perf-stat.i.branch-miss-rate% 1779997 +3.1% 1835782 perf-stat.i.branch-misses 2096 +1.6% 2128 perf-stat.i.context-switches 219.07 +2.3% 224.02 perf-stat.i.cpu-migrations 163199 -11.6% 144321 ± 2% perf-stat.i.cycles-between-cache-misses 986545 +1.0% 996780 perf-stat.i.dTLB-store-misses 4436 +4.1% 4616 perf-stat.i.minor-faults 42.56 ± 3% +3.4 45.95 perf-stat.i.node-load-miss-rate% 396254 +28.2% 507952 ± 3% perf-stat.i.node-load-misses 4436 +4.1% 4617 perf-stat.i.page-faults 38.37 ± 6% +6.3 44.69 ± 7% perf-stat.overall.node-load-miss-rate% 1734727 +2.3% 1774826 perf-stat.ps.branch-misses 216.66 +2.2% 221.40 perf-stat.ps.cpu-migrations 983143 +1.1% 993856 perf-stat.ps.dTLB-store-misses 4178 +4.3% 4357 perf-stat.ps.minor-faults 384816 +29.9% 499993 ± 4% perf-stat.ps.node-load-misses 4178 +4.3% 4357 perf-stat.ps.page-faults 47.25 ± 24% -32.1 15.11 ±142% perf-profile.calltrace.cycles-pp.reader__read_event.perf_session__process_events.record__finish_output.__cmd_record 40.98 ± 34% -27.0 13.98 ±141% perf-profile.calltrace.cycles-pp.ordered_events__queue.process_simple.reader__read_event.perf_session__process_events.record__finish_output 40.76 ± 34% -26.9 13.90 ±141% perf-profile.calltrace.cycles-pp.queue_event.ordered_events__queue.process_simple.reader__read_event.perf_session__process_events 40.90 ± 36% -26.6 14.32 ±141% perf-profile.calltrace.cycles-pp.process_simple.reader__read_event.perf_session__process_events.record__finish_output.__cmd_record 6.07 ±101% -5.4 0.62 ±223% perf-profile.calltrace.cycles-pp.__ordered_events__flush.perf_session__process_user_event.reader__read_event.perf_session__process_events.record__finish_output 5.76 ±110% -5.1 0.62 ±223% perf-profile.calltrace.cycles-pp.perf_session__process_user_event.reader__read_event.perf_session__process_events.record__finish_output.__cmd_record 5.42 ±101% -4.9 0.48 ±223% perf-profile.calltrace.cycles-pp.perf_session__deliver_event.__ordered_events__flush.perf_session__process_user_event.reader__read_event.perf_session__process_events 0.58 ± 18% +0.4 0.94 ± 18% perf-profile.calltrace.cycles-pp.rebalance_domains.__do_softirq.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt 0.49 ± 49% +0.4 0.94 ± 17% perf-profile.calltrace.cycles-pp.load_balance.rebalance_domains.__do_softirq.__irq_exit_rcu.sysvec_apic_timer_interrupt 0.70 ± 25% +0.5 1.21 ± 22% perf-profile.calltrace.cycles-pp.__do_softirq.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt 0.71 ± 24% +0.5 1.22 ± 22% perf-profile.calltrace.cycles-pp.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt 0.20 ±142% +0.5 0.74 ± 18% perf-profile.calltrace.cycles-pp.sched_setaffinity.__x64_sys_sched_setaffinity.do_syscall_64.entry_SYSCALL_64_after_hwframe.sched_setaffinity 0.64 ± 53% +0.5 1.18 ± 32% perf-profile.calltrace.cycles-pp.task_work_run.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt 0.18 ±141% +0.6 0.74 ± 19% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_read.readn.perf_evsel__read.read_counters 0.18 ±141% +0.6 0.74 ± 19% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read.readn.perf_evsel__read 0.18 ±141% +0.6 0.75 ± 19% perf-profile.calltrace.cycles-pp.__libc_read.readn.perf_evsel__read.read_counters.process_interval 0.18 ±141% +0.6 0.76 ± 19% perf-profile.calltrace.cycles-pp.readn.perf_evsel__read.read_counters.process_interval.dispatch_events 0.31 ±103% +0.6 0.89 ± 18% perf-profile.calltrace.cycles-pp.update_sd_lb_stats.find_busiest_group.load_balance.rebalance_domains.__do_softirq 0.10 ±223% +0.6 0.69 ± 18% perf-profile.calltrace.cycles-pp.__sched_setaffinity.sched_setaffinity.__x64_sys_sched_setaffinity.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.71 ± 23% +0.6 1.30 ± 26% perf-profile.calltrace.cycles-pp.seq_read_iter.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.31 ±103% +0.6 0.90 ± 18% perf-profile.calltrace.cycles-pp.find_busiest_group.load_balance.rebalance_domains.__do_softirq.__irq_exit_rcu 0.22 ±142% +0.6 0.81 ± 17% perf-profile.calltrace.cycles-pp.__x64_sys_sched_setaffinity.do_syscall_64.entry_SYSCALL_64_after_hwframe.sched_setaffinity.evlist_cpu_iterator__next 0.57 ± 60% +0.6 1.19 ± 16% perf-profile.calltrace.cycles-pp.__do_sys_newstat.do_syscall_64.entry_SYSCALL_64_after_hwframe.__xstat64 0.58 ± 60% +0.6 1.21 ± 16% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__xstat64 0.58 ± 60% +0.6 1.21 ± 16% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__xstat64 0.22 ±143% +0.6 0.86 ± 18% perf-profile.calltrace.cycles-pp.update_sg_lb_stats.update_sd_lb_stats.find_busiest_group.load_balance.rebalance_domains 0.58 ± 61% +0.6 1.23 ± 16% perf-profile.calltrace.cycles-pp.__xstat64 0.25 ±150% +0.6 0.90 ± 19% perf-profile.calltrace.cycles-pp.do_dentry_open.do_open.path_openat.do_filp_open.do_sys_openat2 0.24 ±142% +0.7 0.90 ± 17% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.sched_setaffinity.evlist_cpu_iterator__next.read_counters 0.24 ±142% +0.7 0.90 ± 18% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.sched_setaffinity.evlist_cpu_iterator__next.read_counters.process_interval 0.21 ±141% +0.7 0.89 ± 19% perf-profile.calltrace.cycles-pp.perf_evsel__read.read_counters.process_interval.dispatch_events.cmd_stat 0.37 ±108% +0.7 1.07 ± 17% perf-profile.calltrace.cycles-pp.evlist__id2evsel.evsel__read_counter.read_counters.process_interval.dispatch_events 0.64 ± 57% +0.7 1.33 ± 20% perf-profile.calltrace.cycles-pp.evlist_cpu_iterator__next.read_counters.process_interval.dispatch_events.cmd_stat 0.10 ±223% +0.7 0.81 ± 27% perf-profile.calltrace.cycles-pp.show_stat.seq_read_iter.vfs_read.ksys_read.do_syscall_64 0.26 ±142% +0.7 1.01 ± 19% perf-profile.calltrace.cycles-pp.sched_setaffinity.evlist_cpu_iterator__next.read_counters.process_interval.dispatch_events 0.51 ± 84% +0.7 1.25 ± 28% perf-profile.calltrace.cycles-pp.do_open.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat 0.09 ±223% +0.8 0.85 ± 27% perf-profile.calltrace.cycles-pp.vmstat_start.seq_read_iter.proc_reg_read_iter.vfs_read.ksys_read 0.53 ± 53% +0.8 1.30 ± 25% perf-profile.calltrace.cycles-pp.seq_read_iter.proc_reg_read_iter.vfs_read.ksys_read.do_syscall_64 0.53 ± 53% +0.8 1.30 ± 25% perf-profile.calltrace.cycles-pp.proc_reg_read_iter.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.85 ± 20% +0.8 1.64 ± 26% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt 0.30 ±103% +0.8 1.12 ± 30% perf-profile.calltrace.cycles-pp.worker_thread.kthread.ret_from_fork.ret_from_fork_asm 0.20 ±143% +0.8 1.03 ± 30% perf-profile.calltrace.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork.ret_from_fork_asm 0.89 ± 23% +0.8 1.72 ± 26% perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt 0.66 ± 70% +0.8 1.48 ± 38% perf-profile.calltrace.cycles-pp.copy_process.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.27 ±155% +0.8 1.12 ± 33% perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read.readn 0.32 ±150% +0.9 1.18 ± 40% perf-profile.calltrace.cycles-pp.dup_mm.copy_process.kernel_clone.__do_sys_clone.do_syscall_64 0.94 ± 23% +0.9 1.83 ± 26% perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt 0.94 ± 23% +0.9 1.83 ± 26% perf-profile.calltrace.cycles-pp.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt 0.15 ±223% +1.0 1.10 ± 44% perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.exit_mmap 0.15 ±223% +1.0 1.12 ± 43% perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.exit_mmap.__mmput 0.15 ±223% +1.0 1.13 ± 43% perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.exit_mmap.__mmput.exit_mm 1.00 ± 51% +1.0 1.99 ± 36% perf-profile.calltrace.cycles-pp.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_fork 1.00 ± 51% +1.0 1.98 ± 36% perf-profile.calltrace.cycles-pp.kernel_clone.__do_sys_clone.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_fork 1.01 ± 51% +1.0 1.99 ± 36% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_fork 1.01 ± 51% +1.0 1.99 ± 36% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_fork 1.06 ± 42% +1.0 2.05 ± 17% perf-profile.calltrace.cycles-pp.evsel__read_counter.read_counters.process_interval.dispatch_events.cmd_stat 0.17 ±223% +1.0 1.22 ± 41% perf-profile.calltrace.cycles-pp.unmap_vmas.exit_mmap.__mmput.exit_mm.do_exit 1.07 ± 54% +1.0 2.12 ± 36% perf-profile.calltrace.cycles-pp.__libc_fork 0.55 ± 75% +1.2 1.74 ± 36% perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault 0.86 ± 59% +1.2 2.10 ± 33% perf-profile.calltrace.cycles-pp.exit_mmap.__mmput.exit_mm.do_exit.do_group_exit 0.87 ± 59% +1.2 2.11 ± 33% perf-profile.calltrace.cycles-pp.__mmput.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group 0.95 ± 59% +1.3 2.29 ± 33% perf-profile.calltrace.cycles-pp.exit_mm.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64 1.74 ± 46% +1.4 3.17 ± 25% perf-profile.calltrace.cycles-pp.do_sys_openat2.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64 1.74 ± 46% +1.4 3.17 ± 25% perf-profile.calltrace.cycles-pp.__x64_sys_openat.do_syscall_64.entry_SYSCALL_64_after_hwframe.open64 1.19 ± 60% +1.6 2.78 ± 31% perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe 1.19 ± 61% +1.6 2.78 ± 31% perf-profile.calltrace.cycles-pp.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe 1.19 ± 61% +1.6 2.78 ± 31% perf-profile.calltrace.cycles-pp.do_group_exit.__x64_sys_exit_group.do_syscall_64.entry_SYSCALL_64_after_hwframe 1.82 ± 24% +1.6 3.46 ± 23% perf-profile.calltrace.cycles-pp.ret_from_fork_asm 1.82 ± 24% +1.6 3.46 ± 23% perf-profile.calltrace.cycles-pp.ret_from_fork.ret_from_fork_asm 1.82 ± 24% +1.6 3.46 ± 23% perf-profile.calltrace.cycles-pp.kthread.ret_from_fork.ret_from_fork_asm 2.15 ± 21% +2.0 4.20 ± 24% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt 1.48 ± 80% +3.4 4.89 ± 18% perf-profile.calltrace.cycles-pp.read_counters.process_interval.dispatch_events.cmd_stat 1.54 ± 79% +3.5 5.03 ± 18% perf-profile.calltrace.cycles-pp.dispatch_events.cmd_stat 1.54 ± 79% +3.5 5.03 ± 18% perf-profile.calltrace.cycles-pp.process_interval.dispatch_events.cmd_stat 1.54 ± 79% +3.5 5.04 ± 18% perf-profile.calltrace.cycles-pp.cmd_stat 0.13 ±223% +3.5 3.67 ± 62% perf-profile.calltrace.cycles-pp.copy_page.folio_copy.migrate_folio_extra.move_to_new_folio.migrate_pages_batch 0.14 ±223% +3.6 3.73 ± 62% perf-profile.calltrace.cycles-pp.folio_copy.migrate_folio_extra.move_to_new_folio.migrate_pages_batch.migrate_pages 0.14 ±223% +3.6 3.73 ± 62% perf-profile.calltrace.cycles-pp.move_to_new_folio.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_huge_pmd_numa_page 0.14 ±223% +3.6 3.73 ± 62% perf-profile.calltrace.cycles-pp.migrate_folio_extra.move_to_new_folio.migrate_pages_batch.migrate_pages.migrate_misplaced_page 0.14 ±223% +3.9 4.00 ± 62% perf-profile.calltrace.cycles-pp.migrate_pages.migrate_misplaced_page.do_huge_pmd_numa_page.__handle_mm_fault.handle_mm_fault 0.14 ±223% +3.9 4.00 ± 62% perf-profile.calltrace.cycles-pp.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_huge_pmd_numa_page.__handle_mm_fault 0.14 ±223% +3.9 4.00 ± 62% perf-profile.calltrace.cycles-pp.migrate_misplaced_page.do_huge_pmd_numa_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault 3.90 ± 41% +3.9 7.77 ± 27% perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read 3.97 ± 41% +3.9 7.84 ± 27% perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read 0.14 ±223% +3.9 4.06 ± 61% perf-profile.calltrace.cycles-pp.do_huge_pmd_numa_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault 4.13 ± 41% +4.0 8.15 ± 27% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read 4.13 ± 41% +4.0 8.17 ± 27% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read 4.18 ± 41% +4.1 8.26 ± 27% perf-profile.calltrace.cycles-pp.read 1.80 ± 50% +5.5 7.29 ± 43% perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault 2.02 ± 50% +5.6 7.64 ± 41% perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault 2.04 ± 50% +5.6 7.66 ± 41% perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault 2.36 ± 33% +5.6 7.99 ± 33% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault 2.14 ± 50% +5.7 7.84 ± 40% perf-profile.calltrace.cycles-pp.asm_exc_page_fault 69.69 ± 16% -30.0 39.64 ± 40% perf-profile.children.cycles-pp.__cmd_record 6.08 ±101% -5.5 0.62 ±223% perf-profile.children.cycles-pp.perf_session__process_user_event 6.15 ±100% -5.4 0.72 ±190% perf-profile.children.cycles-pp.__ordered_events__flush 5.48 ±101% -4.9 0.56 ±188% perf-profile.children.cycles-pp.perf_session__deliver_event 0.06 ± 29% +0.0 0.11 ± 27% perf-profile.children.cycles-pp.path_init 0.02 ±141% +0.0 0.06 ± 33% perf-profile.children.cycles-pp.cp_new_stat 0.02 ±141% +0.1 0.07 ± 25% perf-profile.children.cycles-pp.ptep_clear_flush 0.02 ±146% +0.1 0.08 ± 34% perf-profile.children.cycles-pp.rcu_nocb_try_bypass 0.08 ± 24% +0.1 0.14 ± 32% perf-profile.children.cycles-pp._raw_spin_lock_irq 0.02 ±141% +0.1 0.08 ± 25% perf-profile.children.cycles-pp.__legitimize_mnt 0.00 +0.1 0.06 ± 16% perf-profile.children.cycles-pp.vm_memory_committed 0.11 ± 26% +0.1 0.17 ± 19% perf-profile.children.cycles-pp.aa_file_perm 0.06 ± 50% +0.1 0.12 ± 38% perf-profile.children.cycles-pp.kcpustat_cpu_fetch 0.02 ±141% +0.1 0.08 ± 40% perf-profile.children.cycles-pp.set_next_entity 0.09 ± 39% +0.1 0.16 ± 28% perf-profile.children.cycles-pp.try_charge_memcg 0.02 ±143% +0.1 0.09 ± 38% perf-profile.children.cycles-pp.__evlist__disable 0.01 ±223% +0.1 0.08 ± 35% perf-profile.children.cycles-pp._IO_setvbuf 0.08 ± 36% +0.1 0.16 ± 29% perf-profile.children.cycles-pp.switch_mm_irqs_off 0.02 ±223% +0.1 0.09 ± 27% perf-profile.children.cycles-pp.drm_gem_vunmap_unlocked 0.12 ± 23% +0.1 0.20 ± 35% perf-profile.children.cycles-pp.get_idle_time 0.01 ±223% +0.1 0.08 ± 19% perf-profile.children.cycles-pp.meminfo_proc_show 0.10 ± 14% +0.1 0.18 ± 33% perf-profile.children.cycles-pp.drm_atomic_helper_commit 0.12 ± 17% +0.1 0.20 ± 32% perf-profile.children.cycles-pp.xas_descend 0.05 ± 77% +0.1 0.13 ± 27% perf-profile.children.cycles-pp.fsnotify_perm 0.02 ±223% +0.1 0.10 ± 42% perf-profile.children.cycles-pp.vm_unmapped_area 0.11 ± 13% +0.1 0.19 ± 33% perf-profile.children.cycles-pp.drm_atomic_commit 0.02 ±143% +0.1 0.11 ± 18% perf-profile.children.cycles-pp.__kmalloc 0.04 ±118% +0.1 0.13 ± 43% perf-profile.children.cycles-pp.xas_find 0.00 +0.1 0.08 ± 30% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath 0.08 ± 33% +0.1 0.17 ± 37% perf-profile.children.cycles-pp.node_read_vmstat 0.09 ± 45% +0.1 0.18 ± 29% perf-profile.children.cycles-pp.select_task_rq 0.01 ±223% +0.1 0.10 ± 36% perf-profile.children.cycles-pp.slab_show 0.03 ±143% +0.1 0.12 ± 46% perf-profile.children.cycles-pp.acpi_ps_parse_loop 0.12 ± 36% +0.1 0.21 ± 33% perf-profile.children.cycles-pp.dequeue_entity 0.01 ±223% +0.1 0.10 ± 27% perf-profile.children.cycles-pp._IO_file_doallocate 0.08 ± 53% +0.1 0.17 ± 22% perf-profile.children.cycles-pp.apparmor_ptrace_access_check 0.04 ±105% +0.1 0.13 ± 48% perf-profile.children.cycles-pp.acpi_ps_parse_aml 0.08 ± 32% +0.1 0.18 ± 34% perf-profile.children.cycles-pp.autoremove_wake_function 0.12 ± 26% +0.1 0.22 ± 31% perf-profile.children.cycles-pp.__x64_sys_close 0.11 ± 16% +0.1 0.21 ± 36% perf-profile.children.cycles-pp.drm_atomic_helper_dirtyfb 0.04 ±107% +0.1 0.13 ± 47% perf-profile.children.cycles-pp.acpi_ns_evaluate 0.04 ±107% +0.1 0.13 ± 47% perf-profile.children.cycles-pp.acpi_ps_execute_method 0.02 ±146% +0.1 0.12 ± 32% perf-profile.children.cycles-pp.thread_group_cputime 0.12 ± 35% +0.1 0.22 ± 21% perf-profile.children.cycles-pp.atime_needs_update 0.09 ± 44% +0.1 0.18 ± 23% perf-profile.children.cycles-pp.update_rq_clock_task 0.11 ± 32% +0.1 0.21 ± 25% perf-profile.children.cycles-pp.__perf_event_read_value 0.04 ±107% +0.1 0.14 ± 45% perf-profile.children.cycles-pp.acpi_os_execute_deferred 0.04 ±107% +0.1 0.14 ± 45% perf-profile.children.cycles-pp.acpi_ev_asynch_execute_gpe_method 0.04 ±112% +0.1 0.14 ± 38% perf-profile.children.cycles-pp.get_unmapped_area 0.06 ± 58% +0.1 0.16 ± 38% perf-profile.children.cycles-pp.prepare_task_switch 0.13 ± 34% +0.1 0.23 ± 19% perf-profile.children.cycles-pp.generic_exec_single 0.10 ± 30% +0.1 0.20 ± 30% perf-profile.children.cycles-pp.__wait_for_common 0.03 ±105% +0.1 0.13 ± 29% perf-profile.children.cycles-pp.thread_group_cputime_adjusted 0.13 ± 32% +0.1 0.24 ± 19% perf-profile.children.cycles-pp.smp_call_function_single 0.12 ± 40% +0.1 0.23 ± 26% perf-profile.children.cycles-pp.ttwu_do_activate 0.06 ± 58% +0.1 0.17 ± 32% perf-profile.children.cycles-pp.kstat_irqs_usr 0.10 ± 31% +0.1 0.22 ± 33% perf-profile.children.cycles-pp.__wake_up_common_lock 0.02 ±223% +0.1 0.13 ± 38% perf-profile.children.cycles-pp.free_unref_page_prepare 0.11 ± 48% +0.1 0.22 ± 37% perf-profile.children.cycles-pp.single_release 0.15 ± 33% +0.1 0.26 ± 17% perf-profile.children.cycles-pp.perf_event_read 0.08 ± 48% +0.1 0.20 ± 27% perf-profile.children.cycles-pp.__do_set_cpus_allowed 0.10 ± 70% +0.1 0.21 ± 33% perf-profile.children.cycles-pp.vm_area_dup 0.09 ± 31% +0.1 0.21 ± 35% perf-profile.children.cycles-pp.__wake_up_common 0.20 ± 37% +0.1 0.32 ± 21% perf-profile.children.cycles-pp.update_load_avg 0.12 ± 35% +0.1 0.24 ± 28% perf-profile.children.cycles-pp.blk_mq_queue_tag_busy_iter 0.12 ± 35% +0.1 0.24 ± 28% perf-profile.children.cycles-pp.blk_mq_in_flight 0.20 ± 28% +0.1 0.32 ± 16% perf-profile.children.cycles-pp.__cond_resched 0.17 ± 37% +0.1 0.30 ± 26% perf-profile.children.cycles-pp.dequeue_task_fair 0.08 ± 51% +0.1 0.21 ± 36% perf-profile.children.cycles-pp.free_swap_cache 0.02 ±146% +0.1 0.16 ± 38% perf-profile.children.cycles-pp.flush_tlb_func 0.09 ± 48% +0.1 0.23 ± 37% perf-profile.children.cycles-pp.free_pages_and_swap_cache 0.18 ± 37% +0.1 0.32 ± 26% perf-profile.children.cycles-pp.update_curr 0.02 ±142% +0.1 0.16 ± 26% perf-profile.children.cycles-pp.__x64_sys_newfstat 0.04 ±109% +0.1 0.18 ± 53% perf-profile.children.cycles-pp.free_unref_page_list 0.12 ± 38% +0.1 0.26 ± 30% perf-profile.children.cycles-pp.restore_fpregs_from_fpstate 0.12 ± 59% +0.1 0.27 ± 26% perf-profile.children.cycles-pp.security_ptrace_access_check 0.13 ± 40% +0.1 0.28 ± 33% perf-profile.children.cycles-pp.user_path_at_empty 0.12 ± 31% +0.1 0.27 ± 23% perf-profile.children.cycles-pp.__set_cpus_allowed_ptr_locked 0.20 ± 31% +0.1 0.35 ± 34% perf-profile.children.cycles-pp.dev_attr_show 0.13 ± 44% +0.2 0.28 ± 31% perf-profile.children.cycles-pp.readlink 0.20 ± 30% +0.2 0.35 ± 26% perf-profile.children.cycles-pp.__memcpy 0.00 +0.2 0.15 ± 64% perf-profile.children.cycles-pp.pmdp_invalidate 0.18 ± 31% +0.2 0.34 ± 27% perf-profile.children.cycles-pp.dup_task_struct 0.13 ± 33% +0.2 0.29 ± 29% perf-profile.children.cycles-pp.switch_fpu_return 0.00 +0.2 0.16 ± 64% perf-profile.children.cycles-pp.set_pmd_migration_entry 0.19 ± 26% +0.2 0.34 ± 27% perf-profile.children.cycles-pp.__entry_text_start 0.12 ± 32% +0.2 0.29 ± 38% perf-profile.children.cycles-pp.pipe_write 0.25 ± 35% +0.2 0.42 ± 32% perf-profile.children.cycles-pp.__check_object_size 0.23 ± 35% +0.2 0.40 ± 20% perf-profile.children.cycles-pp.asm_sysvec_reschedule_ipi 0.22 ± 46% +0.2 0.39 ± 21% perf-profile.children.cycles-pp.enqueue_task_fair 0.00 +0.2 0.18 ± 92% perf-profile.children.cycles-pp.cpuidle_enter 0.00 +0.2 0.18 ± 92% perf-profile.children.cycles-pp.cpuidle_enter_state 0.00 +0.2 0.18 ± 59% perf-profile.children.cycles-pp.try_to_migrate 0.00 +0.2 0.18 ± 59% perf-profile.children.cycles-pp.try_to_migrate_one 0.16 ± 37% +0.2 0.34 ± 33% perf-profile.children.cycles-pp.do_readlinkat 0.19 ± 54% +0.2 0.37 ± 44% perf-profile.children.cycles-pp.rcu_cblist_dequeue 0.16 ± 37% +0.2 0.34 ± 32% perf-profile.children.cycles-pp.__x64_sys_readlink 0.00 +0.2 0.19 ± 60% perf-profile.children.cycles-pp.rmap_walk_anon 0.00 +0.2 0.19 ± 66% perf-profile.children.cycles-pp.__sysvec_call_function 0.00 +0.2 0.19 ± 95% perf-profile.children.cycles-pp.cpuidle_idle_call 0.00 +0.2 0.20 ± 59% perf-profile.children.cycles-pp.migrate_folio_unmap 0.21 ± 43% +0.2 0.42 ± 27% perf-profile.children.cycles-pp.diskstats_show 0.28 ± 36% +0.2 0.49 ± 24% perf-profile.children.cycles-pp.__kmem_cache_alloc_node 0.00 +0.2 0.21 ± 53% perf-profile.children.cycles-pp.__flush_smp_call_function_queue 0.01 ±223% +0.2 0.24 ± 63% perf-profile.children.cycles-pp.sysvec_call_function 0.39 ± 15% +0.2 0.63 ± 15% perf-profile.children.cycles-pp.native_irq_return_iret 0.22 ± 13% +0.3 0.48 ± 28% perf-profile.children.cycles-pp.all_vm_events 0.21 ± 38% +0.3 0.48 ± 40% perf-profile.children.cycles-pp.write 0.30 ± 45% +0.3 0.58 ± 27% perf-profile.children.cycles-pp._raw_spin_lock 0.22 ± 40% +0.3 0.50 ± 26% perf-profile.children.cycles-pp.getname_flags 0.28 ± 54% +0.3 0.56 ± 24% perf-profile.children.cycles-pp.memcg_slab_post_alloc_hook 0.30 ± 55% +0.3 0.60 ± 39% perf-profile.children.cycles-pp.dup_mmap 0.01 ±223% +0.3 0.32 ± 88% perf-profile.children.cycles-pp.start_secondary 0.20 ± 38% +0.3 0.50 ± 40% perf-profile.children.cycles-pp.release_pages 0.01 ±223% +0.3 0.32 ± 58% perf-profile.children.cycles-pp.asm_sysvec_call_function 0.01 ±223% +0.3 0.32 ± 85% perf-profile.children.cycles-pp.do_idle 0.01 ±223% +0.3 0.32 ± 85% perf-profile.children.cycles-pp.secondary_startup_64_no_verify 0.01 ±223% +0.3 0.32 ± 85% perf-profile.children.cycles-pp.cpu_startup_entry 0.37 ± 37% +0.3 0.68 ± 30% perf-profile.children.cycles-pp.__close_nocancel 0.26 ± 24% +0.3 0.58 ± 38% perf-profile.children.cycles-pp.drm_fb_helper_damage_work 0.26 ± 24% +0.3 0.58 ± 38% perf-profile.children.cycles-pp.drm_fbdev_generic_helper_fb_dirty 0.36 ± 37% +0.3 0.69 ± 19% perf-profile.children.cycles-pp.perf_read 0.33 ± 29% +0.3 0.66 ± 30% perf-profile.children.cycles-pp.fold_vm_numa_events 0.28 ± 48% +0.3 0.62 ± 34% perf-profile.children.cycles-pp.kmem_cache_free 0.22 ± 81% +0.4 0.58 ± 55% perf-profile.children.cycles-pp.wait4 0.36 ± 32% +0.4 0.72 ± 25% perf-profile.children.cycles-pp.__set_cpus_allowed_ptr 0.40 ± 52% +0.4 0.78 ± 32% perf-profile.children.cycles-pp.__d_lookup_rcu 0.43 ± 23% +0.4 0.81 ± 27% perf-profile.children.cycles-pp.show_stat 0.42 ± 32% +0.4 0.81 ± 24% perf-profile.children.cycles-pp.__sched_setaffinity 0.24 ± 44% +0.4 0.65 ± 43% perf-profile.children.cycles-pp.tlb_batch_pages_flush 0.02 ±223% +0.4 0.45 ± 48% perf-profile.children.cycles-pp.on_each_cpu_cond_mask 0.53 ± 48% +0.4 0.97 ± 34% perf-profile.children.cycles-pp.open_last_lookups 0.03 ±223% +0.4 0.48 ± 42% perf-profile.children.cycles-pp.smp_call_function_many_cond 0.07 ± 58% +0.5 0.52 ± 39% perf-profile.children.cycles-pp.flush_tlb_mm_range 0.43 ± 37% +0.5 0.89 ± 19% perf-profile.children.cycles-pp.perf_evsel__read 0.39 ± 18% +0.5 0.85 ± 27% perf-profile.children.cycles-pp.vmstat_start 0.16 ± 57% +0.5 0.63 ± 42% perf-profile.children.cycles-pp.pick_next_task_fair 0.49 ± 31% +0.5 0.97 ± 23% perf-profile.children.cycles-pp.__x64_sys_sched_setaffinity 0.04 ±168% +0.5 0.52 ± 51% perf-profile.children.cycles-pp.newidle_balance 0.45 ± 28% +0.5 0.96 ± 32% perf-profile.children.cycles-pp.finish_task_switch 0.61 ± 20% +0.5 1.13 ± 24% perf-profile.children.cycles-pp.rebalance_domains 0.55 ± 47% +0.5 1.08 ± 17% perf-profile.children.cycles-pp.evlist__id2evsel 0.46 ± 50% +0.5 0.98 ± 51% perf-profile.children.cycles-pp.do_vmi_munmap 0.29 ± 45% +0.5 0.82 ± 37% perf-profile.children.cycles-pp.tlb_finish_mmu 0.44 ± 53% +0.5 0.98 ± 32% perf-profile.children.cycles-pp.wp_page_copy 0.59 ± 29% +0.6 1.14 ± 30% perf-profile.children.cycles-pp.__percpu_counter_sum 0.46 ± 28% +0.6 1.03 ± 30% perf-profile.children.cycles-pp.process_one_work 0.54 ± 59% +0.6 1.12 ± 29% perf-profile.children.cycles-pp.kmem_cache_alloc 0.63 ± 32% +0.6 1.21 ± 31% perf-profile.children.cycles-pp.__mmdrop 0.76 ± 41% +0.6 1.36 ± 32% perf-profile.children.cycles-pp.walk_component 0.58 ± 53% +0.6 1.18 ± 40% perf-profile.children.cycles-pp.dup_mm 0.48 ± 30% +0.6 1.12 ± 30% perf-profile.children.cycles-pp.worker_thread 0.30 ± 63% +0.6 0.95 ± 41% perf-profile.children.cycles-pp._compound_head 0.68 ± 36% +0.6 1.32 ± 17% perf-profile.children.cycles-pp.readn 0.61 ± 27% +0.7 1.30 ± 25% perf-profile.children.cycles-pp.proc_reg_read_iter 0.99 ± 41% +0.8 1.75 ± 32% perf-profile.children.cycles-pp.lookup_fast 0.78 ± 31% +0.8 1.55 ± 22% perf-profile.children.cycles-pp.evlist_cpu_iterator__next 1.77 ± 16% +0.8 2.55 ± 21% perf-profile.children.cycles-pp.irqentry_exit_to_user_mode 0.58 ± 24% +0.8 1.42 ± 29% perf-profile.children.cycles-pp.update_sg_lb_stats 0.61 ± 24% +0.9 1.48 ± 29% perf-profile.children.cycles-pp.update_sd_lb_stats 0.61 ± 24% +0.9 1.50 ± 30% perf-profile.children.cycles-pp.find_busiest_group 1.08 ± 29% +0.9 2.00 ± 32% perf-profile.children.cycles-pp.__irq_exit_rcu 0.93 ± 48% +0.9 1.87 ± 35% perf-profile.children.cycles-pp.copy_process 0.65 ± 26% +1.0 1.62 ± 29% perf-profile.children.cycles-pp.load_balance 1.00 ± 51% +1.0 1.99 ± 36% perf-profile.children.cycles-pp.__do_sys_clone 1.06 ± 42% +1.0 2.05 ± 17% perf-profile.children.cycles-pp.evsel__read_counter 0.50 ± 62% +1.0 1.51 ± 47% perf-profile.children.cycles-pp.zap_pte_range 0.51 ± 61% +1.0 1.53 ± 46% perf-profile.children.cycles-pp.zap_pmd_range 0.53 ± 61% +1.0 1.57 ± 46% perf-profile.children.cycles-pp.unmap_page_range 1.05 ± 31% +1.1 2.10 ± 23% perf-profile.children.cycles-pp.sched_setaffinity 1.07 ± 54% +1.1 2.12 ± 36% perf-profile.children.cycles-pp.__libc_fork 1.70 ± 17% +1.1 2.80 ± 29% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt 0.61 ± 61% +1.1 1.73 ± 43% perf-profile.children.cycles-pp.unmap_vmas 1.32 ± 29% +1.2 2.48 ± 40% perf-profile.children.cycles-pp.__do_softirq 1.30 ± 29% +1.2 2.48 ± 28% perf-profile.children.cycles-pp.do_fault 1.02 ± 32% +1.3 2.29 ± 32% perf-profile.children.cycles-pp.schedule 0.95 ± 59% +1.3 2.30 ± 33% perf-profile.children.cycles-pp.exit_mm 1.15 ± 34% +1.5 2.65 ± 31% perf-profile.children.cycles-pp.__schedule 1.18 ± 58% +1.6 2.76 ± 32% perf-profile.children.cycles-pp.exit_mmap 1.18 ± 58% +1.6 2.78 ± 32% perf-profile.children.cycles-pp.__mmput 1.23 ± 60% +1.6 2.86 ± 31% perf-profile.children.cycles-pp.do_exit 1.23 ± 60% +1.6 2.87 ± 31% perf-profile.children.cycles-pp.do_group_exit 1.23 ± 60% +1.6 2.87 ± 31% perf-profile.children.cycles-pp.__x64_sys_exit_group 1.82 ± 24% +1.6 3.46 ± 23% perf-profile.children.cycles-pp.kthread 1.83 ± 24% +1.7 3.51 ± 24% perf-profile.children.cycles-pp.ret_from_fork_asm 1.83 ± 23% +1.7 3.50 ± 24% perf-profile.children.cycles-pp.ret_from_fork 2.70 ± 16% +1.8 4.51 ± 30% perf-profile.children.cycles-pp.exit_to_user_mode_loop 2.85 ± 17% +2.0 4.83 ± 29% perf-profile.children.cycles-pp.exit_to_user_mode_prepare 3.90 ± 12% +2.0 5.92 ± 23% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt 2.51 ± 39% +2.4 4.89 ± 18% perf-profile.children.cycles-pp.read_counters 2.60 ± 38% +2.4 5.03 ± 18% perf-profile.children.cycles-pp.dispatch_events 2.60 ± 38% +2.4 5.03 ± 18% perf-profile.children.cycles-pp.process_interval 2.60 ± 38% +2.4 5.04 ± 18% perf-profile.children.cycles-pp.cmd_stat 3.87 ± 43% +3.2 7.04 ± 26% perf-profile.children.cycles-pp.seq_read_iter 0.23 ±170% +3.6 3.83 ± 58% perf-profile.children.cycles-pp.folio_copy 0.23 ±169% +3.6 3.84 ± 58% perf-profile.children.cycles-pp.migrate_folio_extra 0.23 ±169% +3.6 3.84 ± 58% perf-profile.children.cycles-pp.move_to_new_folio 0.28 ±145% +3.7 4.00 ± 56% perf-profile.children.cycles-pp.copy_page 0.24 ±171% +3.9 4.14 ± 58% perf-profile.children.cycles-pp.migrate_pages_batch 0.24 ±171% +3.9 4.14 ± 58% perf-profile.children.cycles-pp.migrate_pages 0.25 ±171% +3.9 4.15 ± 58% perf-profile.children.cycles-pp.migrate_misplaced_page 0.22 ±166% +3.9 4.13 ± 58% perf-profile.children.cycles-pp.do_huge_pmd_numa_page 4.19 ± 41% +4.1 8.29 ± 27% perf-profile.children.cycles-pp.read 4.84 ± 41% +4.1 8.96 ± 25% perf-profile.children.cycles-pp.vfs_read 5.01 ± 41% +4.3 9.29 ± 25% perf-profile.children.cycles-pp.ksys_read 3.24 ± 32% +6.3 9.52 ± 30% perf-profile.children.cycles-pp.__handle_mm_fault 3.68 ± 31% +6.5 10.18 ± 28% perf-profile.children.cycles-pp.handle_mm_fault 4.55 ± 27% +6.8 11.34 ± 24% perf-profile.children.cycles-pp.do_user_addr_fault 4.62 ± 27% +6.8 11.43 ± 24% perf-profile.children.cycles-pp.exc_page_fault 5.01 ± 26% +7.0 12.02 ± 23% perf-profile.children.cycles-pp.asm_exc_page_fault 0.02 ±141% +0.1 0.08 ± 22% perf-profile.self.cycles-pp.__legitimize_mnt 0.11 ± 26% +0.1 0.16 ± 19% perf-profile.self.cycles-pp.aa_file_perm 0.02 ±141% +0.1 0.08 ± 24% perf-profile.self.cycles-pp.perf_evsel__read 0.02 ±144% +0.1 0.08 ± 40% perf-profile.self.cycles-pp.check_heap_object 0.07 ± 30% +0.1 0.13 ± 34% perf-profile.self.cycles-pp._raw_spin_lock_irq 0.00 +0.1 0.06 ± 13% perf-profile.self.cycles-pp._copy_to_iter 0.07 ± 52% +0.1 0.13 ± 16% perf-profile.self.cycles-pp.atime_needs_update 0.06 ± 50% +0.1 0.12 ± 38% perf-profile.self.cycles-pp.kcpustat_cpu_fetch 0.01 ±223% +0.1 0.08 ± 33% perf-profile.self.cycles-pp.wq_worker_comm 0.05 ± 80% +0.1 0.13 ± 26% perf-profile.self.cycles-pp.try_charge_memcg 0.07 ± 57% +0.1 0.14 ± 28% perf-profile.self.cycles-pp.switch_mm_irqs_off 0.05 ± 84% +0.1 0.13 ± 26% perf-profile.self.cycles-pp.update_rq_clock_task 0.02 ±223% +0.1 0.09 ± 26% perf-profile.self.cycles-pp.enqueue_task_fair 0.01 ±223% +0.1 0.09 ± 27% perf-profile.self.cycles-pp.thread_group_cputime 0.04 ±104% +0.1 0.12 ± 23% perf-profile.self.cycles-pp.fsnotify_perm 0.05 ± 86% +0.1 0.13 ± 23% perf-profile.self.cycles-pp.perf_read 0.01 ±223% +0.1 0.09 ± 27% perf-profile.self.cycles-pp._IO_file_doallocate 0.12 ± 19% +0.1 0.20 ± 32% perf-profile.self.cycles-pp.xas_descend 0.00 +0.1 0.08 ± 30% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath 0.10 ± 39% +0.1 0.19 ± 26% perf-profile.self.cycles-pp.update_curr 0.03 ±105% +0.1 0.12 ± 39% perf-profile.self.cycles-pp.__fput 0.03 ±150% +0.1 0.13 ± 40% perf-profile.self.cycles-pp.task_dump_owner 0.06 ± 58% +0.1 0.17 ± 33% perf-profile.self.cycles-pp.kstat_irqs_usr 0.02 ±223% +0.1 0.12 ± 35% perf-profile.self.cycles-pp.free_unref_page_prepare 0.08 ± 27% +0.1 0.20 ± 40% perf-profile.self.cycles-pp.release_pages 0.12 ± 37% +0.1 0.23 ± 27% perf-profile.self.cycles-pp.blk_mq_queue_tag_busy_iter 0.17 ± 37% +0.1 0.29 ± 28% perf-profile.self.cycles-pp.__schedule 0.08 ± 51% +0.1 0.20 ± 35% perf-profile.self.cycles-pp.free_swap_cache 0.13 ± 23% +0.1 0.26 ± 18% perf-profile.self.cycles-pp.__entry_text_start 0.13 ± 39% +0.1 0.26 ± 24% perf-profile.self.cycles-pp.evlist_cpu_iterator__next 0.02 ±142% +0.1 0.15 ± 24% perf-profile.self.cycles-pp.__x64_sys_newfstat 0.08 ± 40% +0.1 0.22 ± 17% perf-profile.self.cycles-pp.vfs_read 0.12 ± 38% +0.1 0.26 ± 30% perf-profile.self.cycles-pp.restore_fpregs_from_fpstate 0.20 ± 32% +0.2 0.35 ± 26% perf-profile.self.cycles-pp.__memcpy 0.22 ± 43% +0.2 0.40 ± 22% perf-profile.self.cycles-pp.do_dentry_open 0.16 ± 41% +0.2 0.34 ± 23% perf-profile.self.cycles-pp.__kmem_cache_alloc_node 0.19 ± 54% +0.2 0.37 ± 44% perf-profile.self.cycles-pp.rcu_cblist_dequeue 0.24 ± 52% +0.2 0.43 ± 35% perf-profile.self.cycles-pp.inode_permission 0.20 ± 39% +0.2 0.40 ± 17% perf-profile.self.cycles-pp.evsel__read_counter 0.14 ± 44% +0.2 0.35 ± 18% perf-profile.self.cycles-pp.memcg_slab_post_alloc_hook 0.39 ± 15% +0.2 0.63 ± 15% perf-profile.self.cycles-pp.native_irq_return_iret 0.22 ± 49% +0.2 0.47 ± 31% perf-profile.self.cycles-pp.kmem_cache_free 0.02 ±223% +0.3 0.27 ± 44% perf-profile.self.cycles-pp.smp_call_function_many_cond 0.22 ± 13% +0.3 0.47 ± 28% perf-profile.self.cycles-pp.all_vm_events 0.24 ± 62% +0.3 0.50 ± 34% perf-profile.self.cycles-pp.kmem_cache_alloc 0.29 ± 42% +0.3 0.56 ± 22% perf-profile.self.cycles-pp.read_counters 0.32 ± 28% +0.3 0.65 ± 31% perf-profile.self.cycles-pp.fold_vm_numa_events 0.39 ± 52% +0.4 0.76 ± 32% perf-profile.self.cycles-pp.__d_lookup_rcu 0.37 ± 18% +0.4 0.75 ± 28% perf-profile.self.cycles-pp.asm_sysvec_apic_timer_interrupt 0.54 ± 46% +0.5 1.05 ± 17% perf-profile.self.cycles-pp.evlist__id2evsel 0.58 ± 29% +0.5 1.10 ± 31% perf-profile.self.cycles-pp.__percpu_counter_sum 0.30 ± 63% +0.6 0.92 ± 40% perf-profile.self.cycles-pp._compound_head 0.46 ± 22% +0.6 1.11 ± 28% perf-profile.self.cycles-pp.update_sg_lb_stats 0.27 ±144% +3.7 3.98 ± 57% perf-profile.self.cycles-pp.copy_page Disclaimer: Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance. -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki