On Tue, 2023-03-21 at 13:43 +0800, Huang, Ying wrote:
> "Liu, Yujie" <yujie.liu@xxxxxxxxx> writes:
> 
> > Hi Ying,
> > 
> > On Mon, 2023-03-20 at 15:58 +0800, Huang, Ying wrote:
> > > Hi, Yujie,
> > > 
> > > kernel test robot <yujie.liu@xxxxxxxxx> writes:
> > > 
> > > > Hello,
> > > > 
> > > > FYI, we noticed a -3.4% regression of vm-scalability.throughput due to commit:
> > > > 
> > > > commit: 7e12beb8ca2ac98b2ec42e0ea4b76cdc93b58654 ("migrate_pages: batch flushing TLB")
> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > > 
> > > > in testcase: vm-scalability
> > > > on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory
> > > > with following parameters:
> > > > 
> > > > 	runtime: 300s
> > > > 	size: 512G
> > > > 	test: anon-cow-rand-mt
> > > > 	cpufreq_governor: performance
> > > > 
> > > > test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
> > > > test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
> > > > 
> > > > 
> > > > If you fix the issue, kindly add following tag
> > > > > Reported-by: kernel test robot <yujie.liu@xxxxxxxxx>
> > > > > Link: https://lore.kernel.org/oe-lkp/202303192325.ecbaf968-yujie.liu@xxxxxxxxx
> > > 
> > > Thanks a lot for the report!  Can you try whether the debug patch as
> > > below can restore the regression?
> > 
> > We've tested the patch and found that the throughput score was partially
> > restored from -3.6% to -1.4%, still with a slight performance drop.
> > Please check the detailed data as follows:
> 
> Good!  Thanks for your detailed data!
> 
> >      0.09 ± 17%      +1.2        1.32 ±  7%      +0.4        0.45 ± 21%  perf-profile.children.cycles-pp.flush_tlb_func
> 
> It appears that we can reduce the unnecessary TLB flushing effectively
> with the previous debug patch.
> But the batched flush (full flush) is
> still slower than the non-batched flush (flush one page).
> 
> Can you try the debug patch as below to check whether it can restore the
> regression completely?  The new debug patch can be applied on top of the
> previous debug patch.

The second debug patch got a -0.7% performance change. The data have some
fluctuations from test to test, and the standard deviation is even a bit
larger than 0.7%, which makes the performance score not very convincing.
Please check other metrics to see if the regression is fully restored.
Thanks.

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/300s/512G/lkp-csl-2sp3/anon-cow-rand-mt/vm-scalability

commit:
  ebe75e4751063 ("migrate_pages: share more code between _unmap and _move")
  9a30245d65679 ("dbg, rmap: avoid flushing TLB in batch if PTE is inaccessible")
  a65085664418d ("dbg, migrate_pages: don't batch flushing for single page migration")

ebe75e4751063dce 9a30245d656794d171cd798a2be a65085664418d7ed1560095d466
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
    57634            -1.5%      56788            -0.8%      57199        vm-scalability.median
    81.16 ± 12%     -20.0       61.18 ± 21%      -5.0       76.14 ± 12%  vm-scalability.stddev%
  5528051            -1.4%    5449450            -0.7%    5487122        vm-scalability.throughput
    305.38           -0.1%     305.19            -0.1%     305.15        vm-scalability.time.elapsed_time
    305.38           -0.1%     305.19            -0.1%     305.15        vm-scalability.time.elapsed_time.max
    652.11 ± 88%    +54.5%       1007 ± 63%     +45.4%     948.20 ± 80%  vm-scalability.time.file_system_inputs
    200293 ±  3%     -4.3%     191707 ±  2%      +1.9%     204033 ±  3%  vm-scalability.time.involuntary_context_switches
     67.11 ± 56%    -95.4%       3.11 ± 80%     -11.3%      59.50 ± 27%  vm-scalability.time.major_page_faults
  32930133           -0.0%   32924571            -0.0%   32922758        vm-scalability.time.maximum_resident_set_size
  67952989 ±  5%    +35.6%   92147668 ±  3%      +2.8%   69849921 ±  8%  vm-scalability.time.minor_page_faults
      4096           +0.0%       4096            +0.0%       4096        vm-scalability.time.page_size
      9006           -0.6%       8956            -0.0%       9005        vm-scalability.time.percent_of_cpu_this_job_got
      1178 ±  3%     +8.6%       1278 ±  3%      -1.9%       1155 ±  4%  vm-scalability.time.system_time
     26327           -1.0%      26056            +0.0%      26327        vm-scalability.time.user_time
     11378 ±  5%   +118.5%      24867 ±  7%      -0.5%      11327 ±  9%  vm-scalability.time.voluntary_context_switches
 1.662e+09           -1.5%  1.638e+09            -0.8%  1.648e+09        vm-scalability.workload
 1.143e+09           +0.6%   1.15e+09 ±  2%      +2.9%  1.176e+09 ±  3%  cpuidle..time
   2464665 ±  3%     +2.0%    2515047 ±  4%      +2.2%    2519159 ±  8%  cpuidle..usage
    367.89           -0.2%     367.16            -0.2%     367.32        uptime.boot
      6393 ±  3%     -0.9%       6336 ±  2%      -0.5%       6363 ±  2%  uptime.idle
     59.33 ±  4%     -0.4%      59.06 ±  2%      -0.6%      58.94 ±  3%  boot-time.boot
     33.79 ±  3%     -0.8%      33.54            -0.7%      33.57        boot-time.dhcp
      5106 ±  4%     -0.6%       5076 ±  2%      -0.8%       5066 ±  3%  boot-time.idle
      1.05 ±  8%     -4.4%       1.01            -4.3%       1.01        boot-time.smp_boot
      3.78           -0.0        3.77 ±  3%      +0.1        3.91 ±  4%  mpstat.cpu.all.idle%
      0.00 ±184%     +0.0        0.00 ± 25%      -0.0        0.00 ± 60%  mpstat.cpu.all.iowait%
      2.58           +0.5        3.09 ±  3%      -0.0        2.56        mpstat.cpu.all.irq%
      0.03 ±  4%     +0.0        0.03 ±  8%      -0.0        0.03 ±  5%  mpstat.cpu.all.soft%
      4.06 ±  3%     +0.3        4.40 ±  3%      -0.1        3.98 ±  4%  mpstat.cpu.all.sys%
     89.55           -0.8       88.71            -0.0       89.52        mpstat.cpu.all.usr%
      0.00         -100.0%       0.00          -100.0%       0.00        numa-numastat.node0.interleave_hit
  14350133 ±  4%     +7.7%   15454129 ±  4%      -0.5%   14283646 ±  4%  numa-numastat.node0.local_node
  14405409 ±  4%     +7.5%   15487972 ±  4%      -0.5%   14332762 ±  4%  numa-numastat.node0.numa_hit
     55258 ± 48%    -37.3%      34622 ± 67%     -13.6%      47731 ± 51%  numa-numastat.node0.other_node
      0.00         -100.0%       0.00          -100.0%       0.00        numa-numastat.node1.interleave_hit
  14402027 ±  3%     +8.4%   15618857 ±  5%      -0.1%   14389667 ±  4%  numa-numastat.node1.local_node
  14433899 ±  3%     +8.6%   15670948 ±  5%      -0.0%   14429236 ±  4%  numa-numastat.node1.numa_hit
     31821 ± 84%    +64.9%      52467 ± 44%     +30.8%      41622 ± 56%  numa-numastat.node1.other_node
    305.38           -0.1%     305.19            -0.1%     305.15        time.elapsed_time
    305.38           -0.1%     305.19            -0.1%     305.15        time.elapsed_time.max
    652.11 ± 88%    +54.5%       1007 ± 63%     +45.4%     948.20 ± 80%  time.file_system_inputs
    200293 ±  3%     -4.3%     191707 ±  2%      +1.9%     204033 ±  3%  time.involuntary_context_switches
     67.11 ± 56%    -95.4%       3.11 ± 80%     -11.3%      59.50 ± 27%  time.major_page_faults
  32930133           -0.0%   32924571            -0.0%   32922758        time.maximum_resident_set_size
  67952989 ±  5%    +35.6%   92147668 ±  3%      +2.8%   69849921 ±  8%  time.minor_page_faults
      4096           +0.0%       4096            +0.0%       4096        time.page_size
      9006           -0.6%       8956            -0.0%       9005        time.percent_of_cpu_this_job_got
      1178 ±  3%     +8.6%       1278 ±  3%      -1.9%       1155 ±  4%  time.system_time
     26327           -1.0%      26056            +0.0%      26327        time.user_time
     11378 ±  5%   +118.5%      24867 ±  7%      -0.5%      11327 ±  9%  time.voluntary_context_switches
      4.00           +0.0%       4.00            +0.0%       4.00        vmstat.cpu.id
      6.00          +16.7%       7.00            +0.0%       6.00        vmstat.cpu.sy
     88.33           -0.9%      87.56            +0.3%      88.60        vmstat.cpu.us
      0.00         -100.0%       0.00          -100.0%       0.00        vmstat.cpu.wa
     10.67 ± 97%    -34.4%       7.00           -34.4%       7.00        vmstat.io.bi
      8.00 ± 70%    -25.0%       6.00           -25.0%       6.00        vmstat.io.bo
      1046           -0.1%       1045            -0.1%       1045        vmstat.memory.buff
   2964204           -0.1%    2962572            -0.1%    2961826        vmstat.memory.cache
  63650311           +0.1%   63687273            +0.1%   63731617        vmstat.memory.free
      0.00         -100.0%       0.00          -100.0%       0.00        vmstat.procs.b
     92.00           -0.2%      91.78            -0.3%      91.70        vmstat.procs.r
      2022 ±  3%     +3.6%       2095            -1.3%       1995        vmstat.system.cs
    539357 ±  2%    +32.9%     716886 ±  4%      -2.1%     528047 ±  5%  vmstat.system.in
    143480 ±  3%    -12.0%     126262 ±  4%      -0.6%     142665 ±  3%  sched_debug.cfs_rq:/.min_vruntime.stddev
    548123 ±  7%    -20.7%     434543 ±  9%      -5.5%     517900 ±  7%  sched_debug.cfs_rq:/.spread0.avg
    655329 ±  6%    -16.2%     549218 ±  6%      -4.7%     624275 ±  5%  sched_debug.cfs_rq:/.spread0.max
    143388 ±  3%    -11.9%     126295 ±  4%      -0.6%     142588 ±  3%  sched_debug.cfs_rq:/.spread0.stddev
    240478 ±  6%    -12.0%     211715 ±  5%      -3.2%     232667 ±  8%  sched_debug.cpu.avg_idle.avg
      1938 ±  5%    +11.4%       2160 ±  3%      -2.1%       1897 ±  4%  sched_debug.cpu.nr_switches.min
  39960890 ±  6%    +54.7%   61837739 ±  4%      +5.0%   41939453 ± 11%  proc-vmstat.numa_hint_faults
  19987976 ±  6%    +55.1%   30996483 ±  4%      +5.0%   20978472 ± 11%  proc-vmstat.numa_hint_faults_local
28840932 ± 3% +8.0% 31160418 ± 4% -0.3% 28764186 ± 4% proc-vmstat.numa_hit 28753783 ± 3% +8.1% 31074486 ± 4% -0.3% 28675501 ± 4% proc-vmstat.numa_local 19745743 ± 5% +11.8% 22080123 ± 6% -0.4% 19668879 ± 6% proc-vmstat.numa_pages_migrated 40107839 ± 6% +54.6% 61988683 ± 4% +5.0% 42094380 ± 11% proc-vmstat.numa_pte_updates 37158989 ± 2% +6.3% 39482935 ± 3% -0.2% 37080293 ± 3% proc-vmstat.pgalloc_normal 68856116 ± 5% +35.1% 93057570 ± 3% +2.8% 70755839 ± 8% proc-vmstat.pgfault 19745743 ± 5% +11.8% 22080123 ± 6% -0.4% 19668879 ± 6% proc-vmstat.pgmigrate_success 19754280 ± 5% +11.8% 22080663 ± 6% -0.4% 19677784 ± 6% proc-vmstat.pgreuse 8953845 ± 3% +13.3% 10142474 ± 2% +0.7% 9013008 ± 2% perf-stat.i.branch-misses 158.09 +7.5% 170.00 ± 2% +1.5% 160.38 ± 3% perf-stat.i.cpu-migrations 9.10 -0.1 8.97 -0.0 9.08 perf-stat.i.dTLB-store-miss-rate% 2454429 ± 2% +26.7% 3110501 ± 5% -5.2% 2326293 ± 3% perf-stat.i.iTLB-load-misses 0.31 ± 38% -68.9% 0.10 ± 31% -11.2% 0.27 ± 22% perf-stat.i.major-faults 224958 ± 5% +35.4% 304571 ± 3% +2.7% 231063 ± 8% perf-stat.i.minor-faults 224959 ± 5% +35.4% 304571 ± 3% +2.7% 231064 ± 8% perf-stat.i.page-faults 0.08 ± 4% +0.0 0.09 ± 3% +0.0 0.08 ± 2% perf-stat.overall.branch-miss-rate% 9.38 -0.1 9.25 -0.0 9.37 perf-stat.overall.dTLB-store-miss-rate% 95.49 +1.0 96.53 -0.3 95.15 perf-stat.overall.iTLB-load-miss-rate% 20490 ± 3% -21.5% 16077 ± 6% +4.5% 21404 ± 4% perf-stat.overall.instructions-per-iTLB-miss 8906114 ± 3% +13.3% 10090374 ± 2% +0.7% 8968593 ± 2% perf-stat.ps.branch-misses 157.57 +7.6% 169.49 ± 2% +1.4% 159.76 ± 3% perf-stat.ps.cpu-migrations 2444301 ± 2% +26.8% 3098710 ± 5% -5.2% 2317560 ± 3% perf-stat.ps.iTLB-load-misses 0.31 ± 38% -68.8% 0.10 ± 31% -10.8% 0.27 ± 22% perf-stat.ps.major-faults 224444 ± 5% +35.3% 303619 ± 3% +2.7% 230589 ± 8% perf-stat.ps.minor-faults 224444 ± 5% +35.3% 303620 ± 3% +2.7% 230589 ± 8% perf-stat.ps.page-faults 1.26 ± 15% -1.3 0.00 -0.0 1.25 ± 14% 
perf-profile.calltrace.cycles-pp.migrate_folio_unmap.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_numa_page 1.14 ± 15% -1.1 0.00 -0.0 1.12 ± 14% perf-profile.calltrace.cycles-pp.try_to_migrate.migrate_folio_unmap.migrate_pages_batch.migrate_pages.migrate_misplaced_page 1.12 ± 15% -1.1 0.00 -0.0 1.11 ± 14% perf-profile.calltrace.cycles-pp.rmap_walk_anon.try_to_migrate.migrate_folio_unmap.migrate_pages_batch.migrate_pages 1.08 ± 15% -1.1 0.00 -0.0 1.06 ± 14% perf-profile.calltrace.cycles-pp.try_to_migrate_one.rmap_walk_anon.try_to_migrate.migrate_folio_unmap.migrate_pages_batch 0.92 ± 15% -0.9 0.00 -0.0 0.92 ± 14% perf-profile.calltrace.cycles-pp.ptep_clear_flush.try_to_migrate_one.rmap_walk_anon.try_to_migrate.migrate_folio_unmap 0.91 ± 15% -0.9 0.00 -0.0 0.91 ± 14% perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.ptep_clear_flush.try_to_migrate_one.rmap_walk_anon.try_to_migrate 0.91 ± 15% -0.9 0.00 -0.0 0.91 ± 14% perf-profile.calltrace.cycles-pp.on_each_cpu_cond_mask.flush_tlb_mm_range.ptep_clear_flush.try_to_migrate_one.rmap_walk_anon 0.91 ± 15% -0.9 0.00 -0.0 0.90 ± 14% perf-profile.calltrace.cycles-pp.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.ptep_clear_flush.try_to_migrate_one 72.48 ± 3% -0.7 71.79 +2.8 75.24 ± 5% perf-profile.calltrace.cycles-pp.do_access 0.26 ±112% -0.3 0.00 +0.1 0.34 ± 82% perf-profile.calltrace.cycles-pp._raw_spin_lock.do_numa_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault 0.19 ±141% -0.2 0.00 -0.0 0.16 ±153% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.do_numa_page.__handle_mm_fault.handle_mm_fault 0.07 ±282% -0.1 0.00 -0.1 0.00 perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.nrand48_r 0.07 ±282% -0.1 0.00 -0.1 0.00 perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.nrand48_r 0.07 ±282% -0.1 0.00 -0.1 0.00 
perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.nrand48_r 0.06 ±282% -0.1 0.00 -0.1 0.00 perf-profile.calltrace.cycles-pp.rmap_walk_anon.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_numa_page 0.13 ±188% -0.0 0.11 ±187% -0.1 0.00 perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.nrand48_r 4.13 ± 3% -0.0 4.12 -0.1 3.98 ± 6% perf-profile.calltrace.cycles-pp.do_rw_once 1.34 ± 39% +0.0 1.35 ± 25% -0.2 1.16 ± 22% perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt 0.55 ± 69% +0.0 0.60 ± 56% -0.1 0.50 ± 52% perf-profile.calltrace.cycles-pp.scheduler_tick.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues 1.09 ± 31% +0.1 1.14 ± 26% -0.2 0.93 ± 37% perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt 1.08 ± 31% +0.1 1.13 ± 26% -0.2 0.92 ± 37% perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt 0.00 +0.1 0.06 ±282% +0.0 0.00 perf-profile.calltrace.cycles-pp.move_to_new_folio.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_numa_page 0.00 +0.1 0.06 ±282% +0.0 0.00 perf-profile.calltrace.cycles-pp.migrate_folio_extra.move_to_new_folio.migrate_pages_batch.migrate_pages.migrate_misplaced_page 1.18 ± 30% +0.1 1.24 ± 26% -0.1 1.07 ± 23% perf-profile.calltrace.cycles-pp.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt 1.52 ± 28% +0.1 1.58 ± 25% -0.2 1.36 ± 21% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.do_access 1.43 ± 29% +0.1 1.50 ± 25% -0.1 1.29 ± 21% 
perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.do_access 1.44 ± 28% +0.1 1.51 ± 25% -0.1 1.30 ± 21% perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.do_access 1.72 ± 25% +0.1 1.80 ± 22% -0.2 1.55 ± 20% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.do_access 6.40 ± 9% +0.1 6.54 -0.6 5.76 ± 17% perf-profile.calltrace.cycles-pp.lrand48_r 0.17 ±196% +0.2 0.33 ± 89% -0.1 0.11 ±200% perf-profile.calltrace.cycles-pp.task_tick_fair.scheduler_tick.update_process_times.tick_sched_handle.tick_sched_timer 0.00 +0.3 0.26 ±113% +0.0 0.00 perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.do_access 0.00 +0.3 0.26 ±113% +0.0 0.00 perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.do_access 0.00 +0.3 0.33 ± 91% +0.0 0.00 perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.do_access 19.08 ± 10% +0.5 19.59 -2.2 16.90 ± 19% perf-profile.calltrace.cycles-pp.nrand48_r 0.00 +0.6 0.59 ± 40% +0.0 0.00 perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.do_access 3.30 ± 15% +0.9 4.18 ± 19% -0.1 3.24 ± 14% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault 3.34 ± 15% +0.9 4.22 ± 19% -0.1 3.27 ± 14% perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access 0.00 +0.9 0.90 ± 18% +0.0 0.00 perf-profile.calltrace.cycles-pp.llist_add_batch.smp_call_function_many_cond.on_each_cpu_cond_mask.arch_tlbbatch_flush.try_to_unmap_flush 3.70 ± 15% +0.9 4.64 ± 19% -0.1 3.60 ± 14% perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.do_access 3.68 ± 15% +0.9 4.63 ± 19% -0.1 3.59 ± 14% 
perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access 3.89 ± 14% +1.0 4.85 ± 19% -0.1 3.76 ± 14% perf-profile.calltrace.cycles-pp.asm_exc_page_fault.do_access 3.03 ± 15% +1.0 4.03 ± 19% -0.1 2.98 ± 14% perf-profile.calltrace.cycles-pp.do_numa_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault 2.46 ± 15% +1.4 3.85 ± 19% -0.1 2.41 ± 14% perf-profile.calltrace.cycles-pp.migrate_misplaced_page.do_numa_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault 2.27 ± 15% +1.4 3.67 ± 19% -0.0 2.22 ± 14% perf-profile.calltrace.cycles-pp.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_numa_page.__handle_mm_fault 2.27 ± 15% +1.4 3.68 ± 19% -0.0 2.23 ± 14% perf-profile.calltrace.cycles-pp.migrate_pages.migrate_misplaced_page.do_numa_page.__handle_mm_fault.handle_mm_fault 0.00 +2.4 2.38 ± 18% +0.0 0.00 perf-profile.calltrace.cycles-pp.smp_call_function_many_cond.on_each_cpu_cond_mask.arch_tlbbatch_flush.try_to_unmap_flush.migrate_pages_batch 0.00 +2.4 2.40 ± 18% +0.0 0.00 perf-profile.calltrace.cycles-pp.on_each_cpu_cond_mask.arch_tlbbatch_flush.try_to_unmap_flush.migrate_pages_batch.migrate_pages 0.00 +2.4 2.40 ± 18% +0.0 0.00 perf-profile.calltrace.cycles-pp.try_to_unmap_flush.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_numa_page 0.00 +2.4 2.40 ± 18% +0.0 0.00 perf-profile.calltrace.cycles-pp.arch_tlbbatch_flush.try_to_unmap_flush.migrate_pages_batch.migrate_pages.migrate_misplaced_page 1.51 ± 16% -1.2 0.31 ± 20% -0.0 1.48 ± 14% perf-profile.children.cycles-pp.rmap_walk_anon 1.25 ± 16% -1.0 0.29 ± 20% -0.0 1.22 ± 15% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath 1.08 ± 15% -1.0 0.12 ± 21% -0.0 1.06 ± 14% perf-profile.children.cycles-pp.try_to_migrate_one 1.14 ± 15% -0.9 0.19 ± 19% -0.0 1.12 ± 14% perf-profile.children.cycles-pp.try_to_migrate 0.92 ± 15% -0.9 0.00 -0.0 0.92 ± 14% perf-profile.children.cycles-pp.ptep_clear_flush 1.26 ± 15% -0.9 0.34 ± 21% -0.0 1.25 ± 
14% perf-profile.children.cycles-pp.migrate_folio_unmap 0.92 ± 15% -0.9 0.00 -0.0 0.91 ± 14% perf-profile.children.cycles-pp.flush_tlb_mm_range 1.05 ± 15% -0.9 0.16 ± 16% -0.0 1.04 ± 15% perf-profile.children.cycles-pp._raw_spin_lock 72.83 ± 3% -0.6 72.25 +2.8 75.59 ± 5% perf-profile.children.cycles-pp.do_access 0.46 ± 15% -0.3 0.11 ± 20% -0.0 0.44 ± 14% perf-profile.children.cycles-pp.page_vma_mapped_walk 0.34 ± 15% -0.3 0.08 ± 18% -0.0 0.33 ± 15% perf-profile.children.cycles-pp.remove_migration_pte 0.14 ± 16% -0.1 0.00 -0.0 0.14 ± 17% perf-profile.children.cycles-pp.handle_pte_fault 0.13 ± 22% -0.0 0.09 ± 23% -0.0 0.12 ± 17% perf-profile.children.cycles-pp.folio_lruvec_lock_irq 0.13 ± 22% -0.0 0.09 ± 22% -0.0 0.12 ± 18% perf-profile.children.cycles-pp._raw_spin_lock_irq 0.09 ± 39% -0.0 0.07 ± 75% -0.0 0.09 ± 52% perf-profile.children.cycles-pp.cpuacct_account_field 0.17 ± 21% -0.0 0.15 ± 21% -0.0 0.16 ± 15% perf-profile.children.cycles-pp.folio_isolate_lru 0.19 ± 20% -0.0 0.17 ± 20% -0.0 0.18 ± 15% perf-profile.children.cycles-pp.numamigrate_isolate_page 0.12 ± 95% -0.0 0.11 ± 16% -0.1 0.06 ± 13% perf-profile.children.cycles-pp.irqentry_exit_to_user_mode 0.09 ± 47% -0.0 0.08 ± 43% -0.0 0.06 ± 38% perf-profile.children.cycles-pp.hrtimer_active 4.37 ± 3% -0.0 4.36 -0.2 4.22 ± 5% perf-profile.children.cycles-pp.do_rw_once 0.33 ± 2% -0.0 0.32 ± 2% -0.0 0.32 ± 5% perf-profile.children.cycles-pp.lrand48_r@plt 0.01 ±282% -0.0 0.00 -0.0 0.00 perf-profile.children.cycles-pp.enqueue_hrtimer 0.01 ±282% -0.0 0.00 -0.0 0.00 perf-profile.children.cycles-pp.timerqueue_add 0.06 ± 13% -0.0 0.05 ± 37% -0.0 0.04 ± 51% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe 0.06 ± 13% -0.0 0.05 ± 37% -0.0 0.04 ± 51% perf-profile.children.cycles-pp.do_syscall_64 0.01 ±282% -0.0 0.00 -0.0 0.00 perf-profile.children.cycles-pp.lapic_next_deadline 0.01 ±282% -0.0 0.00 -0.0 0.00 perf-profile.children.cycles-pp.hrtimer_update_next_event 0.01 ±282% -0.0 0.00 -0.0 0.00 
perf-profile.children.cycles-pp.update_min_vruntime 0.01 ±282% -0.0 0.00 -0.0 0.00 perf-profile.children.cycles-pp.rcu_core 0.15 ± 20% -0.0 0.15 ± 21% -0.0 0.14 ± 17% perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave 0.07 ± 27% -0.0 0.06 ± 55% -0.0 0.05 ± 53% perf-profile.children.cycles-pp.ktime_get 0.01 ±193% -0.0 0.01 ±188% -0.0 0.01 ±201% perf-profile.children.cycles-pp.perf_mux_hrtimer_handler 0.01 ±282% -0.0 0.01 ±282% -0.0 0.01 ±299% perf-profile.children.cycles-pp.perf_rotate_context 0.21 ± 17% -0.0 0.21 ± 18% -0.0 0.20 ± 15% perf-profile.children.cycles-pp._raw_spin_lock_irqsave 0.02 ±209% -0.0 0.02 ±142% -0.0 0.01 ±300% perf-profile.children.cycles-pp.update_cfs_group 0.05 ± 43% -0.0 0.05 ± 57% -0.0 0.04 ± 67% perf-profile.children.cycles-pp.update_irq_load_avg 0.00 +0.0 0.00 +0.0 0.02 ±300% perf-profile.children.cycles-pp.secondary_startup_64_no_verify 0.00 +0.0 0.00 +0.0 0.02 ±300% perf-profile.children.cycles-pp.start_secondary 0.00 +0.0 0.00 +0.0 0.02 ±300% perf-profile.children.cycles-pp.cpu_startup_entry 0.00 +0.0 0.00 +0.0 0.02 ±300% perf-profile.children.cycles-pp.do_idle 0.00 +0.0 0.00 +0.0 0.02 ±300% perf-profile.children.cycles-pp.cpuidle_idle_call 0.00 +0.0 0.00 +0.0 0.02 ±300% perf-profile.children.cycles-pp.cpuidle_enter 0.00 +0.0 0.00 +0.0 0.02 ±300% perf-profile.children.cycles-pp.cpuidle_enter_state 0.00 +0.0 0.00 +0.0 0.02 ±300% perf-profile.children.cycles-pp.mwait_idle_with_hints 0.00 +0.0 0.00 +0.0 0.02 ±300% perf-profile.children.cycles-pp.intel_idle 0.06 ± 18% +0.0 0.07 ± 41% -0.0 0.05 ± 66% perf-profile.children.cycles-pp.rcu_pending 0.02 ±112% +0.0 0.03 ±111% -0.0 0.01 ±300% perf-profile.children.cycles-pp.timerqueue_del 0.02 ±111% +0.0 0.03 ±112% +0.0 0.03 ±100% perf-profile.children.cycles-pp.irqtime_account_process_tick 0.06 ± 18% +0.0 0.06 ± 19% -0.0 0.03 ± 82% perf-profile.children.cycles-pp.mt_find 0.07 ± 39% +0.0 0.07 ± 28% -0.0 0.05 ± 55% perf-profile.children.cycles-pp.ktime_get_update_offsets_now 0.00 +0.0 0.01 
±282% +0.0 0.00 perf-profile.children.cycles-pp._find_next_bit 0.00 +0.0 0.01 ±282% +0.0 0.00 perf-profile.children.cycles-pp.folio_get_anon_vma 0.00 +0.0 0.01 ±282% +0.0 0.00 perf-profile.children.cycles-pp.__free_one_page 0.06 ± 18% +0.0 0.06 ± 20% -0.0 0.03 ± 82% perf-profile.children.cycles-pp.find_vma 0.11 ± 25% +0.0 0.11 ± 25% -0.0 0.09 ± 38% perf-profile.children.cycles-pp.update_rq_clock 0.32 ± 19% +0.0 0.33 ± 32% -0.0 0.30 ± 31% perf-profile.children.cycles-pp.account_user_time 0.21 ± 48% +0.0 0.22 ± 28% -0.0 0.18 ± 23% perf-profile.children.cycles-pp.update_load_avg 0.09 ± 20% +0.0 0.09 ± 23% -0.0 0.08 ± 38% perf-profile.children.cycles-pp.tick_sched_do_timer 0.02 ±154% +0.0 0.03 ± 92% -0.0 0.02 ±155% perf-profile.children.cycles-pp.__do_softirq 0.07 ± 35% +0.0 0.08 ± 26% -0.0 0.07 ± 20% perf-profile.children.cycles-pp.clockevents_program_event 0.08 ± 36% +0.0 0.09 ± 24% -0.0 0.07 ± 19% perf-profile.children.cycles-pp.__irq_exit_rcu 0.03 ±127% +0.0 0.04 ± 72% -0.0 0.02 ±123% perf-profile.children.cycles-pp.__update_load_avg_cfs_rq 0.08 ± 18% +0.0 0.09 ± 26% -0.0 0.06 ± 53% perf-profile.children.cycles-pp.rcu_sched_clock_irq 0.00 +0.0 0.01 ±187% +0.0 0.00 perf-profile.children.cycles-pp.lru_add_fn 0.21 ± 19% +0.0 0.22 ± 21% -0.0 0.20 ± 15% perf-profile.children.cycles-pp.folio_batch_move_lru 0.21 ± 19% +0.0 0.22 ± 20% -0.0 0.20 ± 15% perf-profile.children.cycles-pp.lru_add_drain 0.21 ± 19% +0.0 0.22 ± 20% -0.0 0.20 ± 15% perf-profile.children.cycles-pp.lru_add_drain_cpu 0.06 ± 39% +0.0 0.07 ± 21% +0.0 0.06 ± 15% perf-profile.children.cycles-pp.rmqueue_bulk 0.06 ± 16% +0.0 0.08 ± 21% -0.0 0.06 ± 13% perf-profile.children.cycles-pp.free_unref_page 0.09 ± 16% +0.0 0.11 ± 22% -0.0 0.09 ± 14% perf-profile.children.cycles-pp.__alloc_pages 0.09 ± 15% +0.0 0.11 ± 21% -0.0 0.09 ± 17% perf-profile.children.cycles-pp.rmqueue 0.09 ± 16% +0.0 0.11 ± 21% -0.0 0.09 ± 14% perf-profile.children.cycles-pp.get_page_from_freelist 0.03 ± 71% +0.0 0.05 ± 39% -0.0 0.03 ± 82% 
perf-profile.children.cycles-pp.free_pcppages_bulk 0.00 +0.0 0.02 ±142% +0.0 0.00 perf-profile.children.cycles-pp.can_change_pte_writable 0.00 +0.0 0.02 ±142% +0.0 0.00 perf-profile.children.cycles-pp.folio_migrate_flags 0.03 ±152% +0.0 0.04 ± 72% +0.0 0.03 ± 84% perf-profile.children.cycles-pp.__update_load_avg_se 0.09 ± 18% +0.0 0.11 ± 22% -0.0 0.09 ± 14% perf-profile.children.cycles-pp.__folio_alloc 0.09 ± 18% +0.0 0.11 ± 22% +0.0 0.09 ± 16% perf-profile.children.cycles-pp.alloc_misplaced_dst_page 0.08 ± 15% +0.0 0.10 ± 21% -0.0 0.08 ± 16% perf-profile.children.cycles-pp.__list_del_entry_valid 0.04 ± 91% +0.0 0.06 ± 38% +0.0 0.04 ± 66% perf-profile.children.cycles-pp.arch_scale_freq_tick 0.11 ± 16% +0.0 0.13 ± 29% -0.0 0.10 ± 28% perf-profile.children.cycles-pp.__cgroup_account_cputime_field 0.19 ± 17% +0.0 0.21 ± 18% -0.0 0.18 ± 18% perf-profile.children.cycles-pp.down_read_trylock 0.03 ±118% +0.0 0.05 ± 59% -0.0 0.03 ±101% perf-profile.children.cycles-pp.perf_trace_sched_stat_runtime 0.02 ±142% +0.0 0.04 ± 72% -0.0 0.01 ±299% perf-profile.children.cycles-pp.irqtime_account_irq 0.25 ± 39% +0.0 0.27 ± 25% -0.0 0.22 ± 22% perf-profile.children.cycles-pp.update_curr 0.09 ± 7% +0.0 0.11 ± 14% -0.0 0.08 ± 15% perf-profile.children.cycles-pp.sync_regs 0.16 ± 13% +0.0 0.18 ± 19% -0.0 0.15 ± 14% perf-profile.children.cycles-pp.up_read 0.68 ± 45% +0.0 0.71 ± 28% -0.1 0.58 ± 24% perf-profile.children.cycles-pp.task_tick_fair 0.02 ±141% +0.0 0.05 ± 42% +0.0 0.02 ±122% perf-profile.children.cycles-pp.uncharge_batch 0.01 ±282% +0.0 0.04 ± 75% +0.0 0.01 ±200% perf-profile.children.cycles-pp.page_counter_uncharge 0.02 ±141% +0.0 0.06 ± 44% +0.0 0.02 ±100% perf-profile.children.cycles-pp.__mem_cgroup_uncharge 0.02 ±141% +0.0 0.06 ± 44% +0.0 0.02 ±100% perf-profile.children.cycles-pp.__folio_put 0.03 ± 71% +0.0 0.08 ± 25% -0.0 0.03 ± 82% perf-profile.children.cycles-pp.mem_cgroup_migrate 0.96 ± 40% +0.0 1.00 ± 27% -0.1 0.81 ± 24% perf-profile.children.cycles-pp.scheduler_tick 
0.11 ± 20% +0.0 0.16 ± 15% -0.0 0.11 ± 11% perf-profile.children.cycles-pp.native_irq_return_iret 0.06 ± 13% +0.1 0.11 ± 16% -0.0 0.06 ± 13% perf-profile.children.cycles-pp.exit_to_user_mode_prepare 0.05 ± 36% +0.1 0.10 ± 18% -0.0 0.04 ± 51% perf-profile.children.cycles-pp.exit_to_user_mode_loop 0.01 ±187% +0.1 0.07 ± 26% +0.0 0.02 ±122% perf-profile.children.cycles-pp.page_counter_charge 0.04 ± 71% +0.1 0.10 ± 18% +0.0 0.04 ± 51% perf-profile.children.cycles-pp.task_work_run 0.17 ± 14% +0.1 0.23 ± 20% -0.0 0.17 ± 15% perf-profile.children.cycles-pp.copy_page 0.17 ± 13% +0.1 0.24 ± 19% -0.0 0.17 ± 15% perf-profile.children.cycles-pp.folio_copy 0.03 ± 90% +0.1 0.10 ± 16% +0.0 0.04 ± 51% perf-profile.children.cycles-pp.change_pte_range 0.03 ± 90% +0.1 0.10 ± 18% +0.0 0.04 ± 51% perf-profile.children.cycles-pp.task_numa_work 0.03 ± 90% +0.1 0.10 ± 18% +0.0 0.04 ± 51% perf-profile.children.cycles-pp.change_prot_numa 0.03 ± 90% +0.1 0.10 ± 18% +0.0 0.04 ± 51% perf-profile.children.cycles-pp.change_protection_range 0.03 ± 90% +0.1 0.10 ± 18% +0.0 0.04 ± 51% perf-profile.children.cycles-pp.change_pmd_range 0.06 ± 40% +0.1 0.13 ± 23% +0.0 0.06 ± 15% perf-profile.children.cycles-pp.__default_send_IPI_dest_field 1.58 ± 32% +0.1 1.65 ± 25% -0.2 1.36 ± 25% perf-profile.children.cycles-pp.tick_sched_handle 1.56 ± 32% +0.1 1.64 ± 25% -0.2 1.35 ± 25% perf-profile.children.cycles-pp.update_process_times 1.85 ± 30% +0.1 1.94 ± 25% -0.2 1.61 ± 24% perf-profile.children.cycles-pp.__hrtimer_run_queues 1.71 ± 31% +0.1 1.79 ± 25% -0.2 1.49 ± 25% perf-profile.children.cycles-pp.tick_sched_timer 0.08 ± 16% +0.1 0.17 ± 21% -0.0 0.08 ± 17% perf-profile.children.cycles-pp.default_send_IPI_mask_sequence_phys 2.09 ± 29% +0.1 2.18 ± 24% -0.3 1.81 ± 23% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt 2.06 ± 29% +0.1 2.16 ± 24% -0.3 1.79 ± 23% perf-profile.children.cycles-pp.hrtimer_interrupt 2.19 ± 29% +0.1 2.29 ± 24% -0.3 1.89 ± 23% 
perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt 0.25 ± 12% +0.1 0.36 ± 20% -0.0 0.25 ± 14% perf-profile.children.cycles-pp.move_to_new_folio 0.25 ± 12% +0.1 0.36 ± 20% -0.0 0.25 ± 14% perf-profile.children.cycles-pp.migrate_folio_extra 0.00 +0.1 0.12 ± 22% +0.0 0.00 perf-profile.children.cycles-pp.native_flush_tlb_local 2.48 ± 26% +0.1 2.60 ± 22% -0.3 2.14 ± 22% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt 9.27 ± 8% +0.2 9.45 -0.9 8.41 ± 15% perf-profile.children.cycles-pp.lrand48_r 0.25 ± 14% +0.3 0.55 ± 18% -0.0 0.24 ± 15% perf-profile.children.cycles-pp.llist_reverse_order 0.09 ± 17% +0.4 0.45 ± 21% +0.0 0.09 ± 12% perf-profile.children.cycles-pp.flush_tlb_func 16.69 ± 10% +0.5 17.16 -2.0 14.72 ± 19% perf-profile.children.cycles-pp.nrand48_r 0.40 ± 15% +0.5 0.93 ± 18% -0.0 0.39 ± 14% perf-profile.children.cycles-pp.llist_add_batch 0.41 ± 14% +0.7 1.14 ± 19% -0.0 0.41 ± 14% perf-profile.children.cycles-pp.__sysvec_call_function 0.41 ± 14% +0.7 1.14 ± 19% -0.0 0.41 ± 14% perf-profile.children.cycles-pp.__flush_smp_call_function_queue 0.43 ± 14% +0.7 1.17 ± 19% -0.0 0.42 ± 14% perf-profile.children.cycles-pp.sysvec_call_function 0.55 ± 12% +0.9 1.40 ± 19% -0.0 0.53 ± 15% perf-profile.children.cycles-pp.asm_sysvec_call_function 3.31 ± 15% +0.9 4.19 ± 19% -0.1 3.24 ± 14% perf-profile.children.cycles-pp.__handle_mm_fault 3.34 ± 15% +0.9 4.23 ± 19% -0.1 3.27 ± 14% perf-profile.children.cycles-pp.handle_mm_fault 3.70 ± 15% +0.9 4.64 ± 19% -0.1 3.60 ± 14% perf-profile.children.cycles-pp.exc_page_fault 3.70 ± 15% +0.9 4.64 ± 19% -0.1 3.60 ± 14% perf-profile.children.cycles-pp.do_user_addr_fault 3.91 ± 14% +1.0 4.88 ± 19% -0.1 3.78 ± 14% perf-profile.children.cycles-pp.asm_exc_page_fault 3.03 ± 15% +1.0 4.03 ± 19% -0.1 2.98 ± 14% perf-profile.children.cycles-pp.do_numa_page 2.46 ± 15% +1.4 3.85 ± 19% -0.1 2.41 ± 14% perf-profile.children.cycles-pp.migrate_misplaced_page 2.27 ± 15% +1.4 3.67 ± 19% -0.0 2.22 ± 14% 
                 perf-profile.children.cycles-pp.migrate_pages_batch
      2.27 ± 15%      +1.4        3.68 ± 19%      -0.0        2.23 ± 14%  perf-profile.children.cycles-pp.migrate_pages
      0.91 ± 15%      +1.5        2.42 ± 18%      -0.0        0.91 ± 14%  perf-profile.children.cycles-pp.smp_call_function_many_cond
      0.91 ± 15%      +1.5        2.42 ± 18%      -0.0        0.91 ± 14%  perf-profile.children.cycles-pp.on_each_cpu_cond_mask
      0.00            +2.4        2.40 ± 18%      +0.0        0.00        perf-profile.children.cycles-pp.try_to_unmap_flush
      0.00            +2.4        2.40 ± 18%      +0.0        0.00        perf-profile.children.cycles-pp.arch_tlbbatch_flush
     66.95 ±  3%      -2.0       64.95            +3.1       70.02 ±  6%  perf-profile.self.cycles-pp.do_access
      1.14 ± 16%      -0.9        0.28 ± 21%      -0.0        1.12 ± 15%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      0.06 ±187%      -0.1        0.00            -0.1        0.00        perf-profile.self.cycles-pp.irqentry_exit_to_user_mode
      4.08 ±  3%      -0.0        4.03            -0.1        3.94 ±  5%  perf-profile.self.cycles-pp.do_rw_once
      0.09 ± 39%      -0.0        0.07 ± 75%      -0.0        0.09 ± 52%  perf-profile.self.cycles-pp.cpuacct_account_field
      0.06 ± 14%      -0.0        0.04 ± 72%      -0.0        0.03 ± 82%  perf-profile.self.cycles-pp.mt_find
      0.01 ±282%      -0.0        0.00            -0.0        0.00        perf-profile.self.cycles-pp.lapic_next_deadline
      0.01 ±188%      -0.0        0.01 ±282%      -0.0        0.01 ±200%  perf-profile.self.cycles-pp.rmap_walk_anon
      0.01 ±282%      -0.0        0.00            -0.0        0.00        perf-profile.self.cycles-pp.update_min_vruntime
      0.08 ± 47%      -0.0        0.07 ± 45%      -0.0        0.06 ± 38%  perf-profile.self.cycles-pp.hrtimer_active
      0.29 ±  4%      -0.0        0.28 ±  2%      -0.0        0.28 ±  6%  perf-profile.self.cycles-pp.lrand48_r@plt
      0.06 ± 49%      -0.0        0.06 ± 56%      -0.0        0.05 ± 52%  perf-profile.self.cycles-pp.scheduler_tick
      0.02 ±209%      -0.0        0.02 ±142%      -0.0        0.01 ±300%  perf-profile.self.cycles-pp.update_cfs_group
      0.05 ± 43%      -0.0        0.05 ± 57%      -0.0        0.04 ± 67%  perf-profile.self.cycles-pp.update_irq_load_avg
      0.11 ± 49%      -0.0        0.11 ± 29%      -0.0        0.09 ± 23%  perf-profile.self.cycles-pp.update_load_avg
      0.09 ± 41%      +0.0        0.09 ± 42%      -0.0        0.08 ± 24%  perf-profile.self.cycles-pp.task_tick_fair
      0.00            +0.0        0.00            +0.0        0.02 ±300%  perf-profile.self.cycles-pp.mwait_idle_with_hints
      0.12 ± 27%      +0.0        0.13 ± 36%      -0.0        0.10 ± 45%  perf-profile.self.cycles-pp.account_user_time
      0.11 ± 17%      +0.0        0.12 ± 20%      -0.0        0.10 ± 15%  perf-profile.self.cycles-pp.__handle_mm_fault
      0.02 ±111%      +0.0        0.03 ±112%      +0.0        0.03 ±100%  perf-profile.self.cycles-pp.irqtime_account_process_tick
      0.06 ± 55%      +0.0        0.06 ± 42%      -0.0        0.04 ± 84%  perf-profile.self.cycles-pp.ktime_get_update_offsets_now
      0.08 ± 17%      +0.0        0.08 ± 21%      -0.0        0.07 ± 15%  perf-profile.self.cycles-pp.page_vma_mapped_walk
      0.00            +0.0        0.01 ±282%      +0.0        0.00        perf-profile.self.cycles-pp.__free_one_page
      0.02 ±141%      +0.0        0.02 ±112%      -0.0        0.01 ±300%  perf-profile.self.cycles-pp.hrtimer_interrupt
      0.06 ± 42%      +0.0        0.07 ± 43%      -0.0        0.06 ± 37%  perf-profile.self.cycles-pp.update_process_times
      0.07 ± 16%      +0.0        0.08 ± 25%      -0.0        0.07 ± 38%  perf-profile.self.cycles-pp.tick_sched_do_timer
      0.00            +0.0        0.01 ±187%      +0.0        0.00        perf-profile.self.cycles-pp.can_change_pte_writable
      0.00            +0.0        0.01 ±187%      +0.0        0.00        perf-profile.self.cycles-pp.folio_migrate_flags
      0.00            +0.0        0.01 ±188%      +0.0        0.00        perf-profile.self.cycles-pp.try_to_migrate_one
      0.02 ±191%      +0.0        0.03 ± 90%      -0.0        0.01 ±200%  perf-profile.self.cycles-pp.__update_load_avg_se
      0.19 ± 16%      +0.0        0.20 ± 19%      -0.0        0.17 ± 17%  perf-profile.self.cycles-pp.down_read_trylock
      0.03 ±113%      +0.0        0.04 ± 71%      -0.0        0.03 ±100%  perf-profile.self.cycles-pp.update_rq_clock
      0.09 ± 14%      +0.0        0.10 ± 16%      -0.0        0.08 ± 17%  perf-profile.self.cycles-pp._raw_spin_lock
      0.03 ±151%      +0.0        0.04 ± 72%      -0.0        0.02 ±123%  perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
      0.01 ±282%      +0.0        0.02 ±112%      -0.0        0.00        perf-profile.self.cycles-pp._raw_spin_lock_irqsave
      0.01 ±282%      +0.0        0.02 ±112%      -0.0        0.00        perf-profile.self.cycles-pp.rcu_pending
      0.10 ± 16%      +0.0        0.12 ± 29%      -0.0        0.10 ± 26%  perf-profile.self.cycles-pp.__cgroup_account_cputime_field
      0.01 ±282%      +0.0        0.02 ±112%      -0.0        0.01 ±300%  perf-profile.self.cycles-pp.__hrtimer_run_queues
      0.16 ± 41%      +0.0        0.18 ± 25%      -0.0        0.14 ± 24%  perf-profile.self.cycles-pp.update_curr
      0.15 ± 14%      +0.0        0.17 ± 21%      -0.0        0.14 ± 15%  perf-profile.self.cycles-pp.up_read
      0.04 ± 94%      +0.0        0.05 ± 56%      -0.0        0.03 ±100%  perf-profile.self.cycles-pp.ktime_get
      0.07 ± 16%      +0.0        0.10 ± 21%      +0.0        0.08 ± 16%  perf-profile.self.cycles-pp.__list_del_entry_valid
      0.04 ± 91%      +0.0        0.06 ± 38%      +0.0        0.04 ± 66%  perf-profile.self.cycles-pp.arch_scale_freq_tick
      0.03 ±118%      +0.0        0.05 ± 59%      -0.0        0.03 ±101%  perf-profile.self.cycles-pp.perf_trace_sched_stat_runtime
      0.00            +0.0        0.03 ±113%      +0.0        0.00        perf-profile.self.cycles-pp.page_counter_uncharge
      0.09 ±  7%      +0.0        0.11 ± 14%      -0.0        0.08 ± 15%  perf-profile.self.cycles-pp.sync_regs
      0.02 ±111%      +0.0        0.06 ± 15%      -0.0        0.02 ±152%  perf-profile.self.cycles-pp.change_pte_range
      0.11 ± 20%      +0.0        0.15 ± 15%      -0.0        0.11 ± 11%  perf-profile.self.cycles-pp.native_irq_return_iret
      0.01 ±282%      +0.1        0.06 ± 43%      -0.0        0.01 ±299%  perf-profile.self.cycles-pp.page_counter_charge
      0.16 ± 15%      +0.1        0.22 ± 21%      -0.0        0.16 ± 16%  perf-profile.self.cycles-pp.copy_page
      0.06 ± 40%      +0.1        0.13 ± 23%      +0.0        0.06 ± 15%  perf-profile.self.cycles-pp.__default_send_IPI_dest_field
      0.07 ± 15%      +0.1        0.16 ± 18%      +0.0        0.07 ± 15%  perf-profile.self.cycles-pp.__flush_smp_call_function_queue
      0.00            +0.1        0.11 ± 19%      +0.0        0.00        perf-profile.self.cycles-pp.native_flush_tlb_local
      8.81 ±  9%      +0.1        8.94 ±  2%      -0.8        7.99 ± 16%  perf-profile.self.cycles-pp.lrand48_r
      0.06 ± 16%      +0.3        0.33 ± 21%      -0.0        0.06 ± 36%  perf-profile.self.cycles-pp.flush_tlb_func
      0.25 ± 14%      +0.3        0.55 ± 18%      -0.0        0.24 ± 15%  perf-profile.self.cycles-pp.llist_reverse_order
     13.38 ± 11%      +0.3       13.71            -1.7       11.73 ± 21%  perf-profile.self.cycles-pp.nrand48_r
      0.35 ± 15%      +0.4        0.76 ± 18%      -0.0        0.34 ± 13%  perf-profile.self.cycles-pp.llist_add_batch
      0.37 ± 17%      +0.7        1.10 ± 18%      +0.0        0.38 ± 15%  perf-profile.self.cycles-pp.smp_call_function_many_cond

--
Best Regards,
Yujie

> Best Regards,
> Huang, Ying
>
> ---------------------------8<-----------------------------------------
> From b36b662c80652447d7374faff1142a941dc9d617 Mon Sep 17 00:00:00 2001
> From: Huang Ying <ying.huang@xxxxxxxxx>
> Date: Mon, 20 Mar 2023 15:38:12 +0800
> Subject: [PATCH] dbg, migrate_pages: don't batch flushing for single page
>  migration
>
> ---
>  mm/migrate.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 98f1c11197a8..7271209c1a03 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1113,8 +1113,8 @@ static void migrate_folio_done(struct folio *src,
>  static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page,
>  			       unsigned long private, struct folio *src,
>  			       struct folio **dstp, int force, bool avoid_force_lock,
> -			       enum migrate_mode mode, enum migrate_reason reason,
> -			       struct list_head *ret)
> +			       bool batch_flush, enum migrate_mode mode,
> +			       enum migrate_reason reason, struct list_head *ret)
>  {
>  	struct folio *dst;
>  	int rc = -EAGAIN;
> @@ -1253,7 +1253,7 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
>  		/* Establish migration ptes */
>  		VM_BUG_ON_FOLIO(folio_test_anon(src) &&
>  				!folio_test_ksm(src) && !anon_vma, src);
> -		try_to_migrate(src, TTU_BATCH_FLUSH);
> +		try_to_migrate(src, batch_flush ? TTU_BATCH_FLUSH : 0);
>  		page_was_mapped = 1;
>  	}
>
> @@ -1641,6 +1641,7 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>  	bool nosplit = (reason == MR_NUMA_MISPLACED);
>  	bool no_split_folio_counting = false;
>  	bool avoid_force_lock;
> +	bool batch_flush = !list_is_singular(from);
>
>  retry:
>  	rc_saved = 0;
> @@ -1690,7 +1691,7 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>
>  		rc = migrate_folio_unmap(get_new_page, put_new_page, private,
>  					 folio, &dst, pass > 2, avoid_force_lock,
> -					 mode, reason, ret_folios);
> +					 batch_flush, mode, reason, ret_folios);
>  		/*
>  		 * The rules are:
>  		 *  Success: folio will be freed
> @@ -1804,7 +1805,8 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>  		stats->nr_failed_pages += nr_retry_pages;
> move:
>  	/* Flush TLBs for all unmapped folios */
> -	try_to_unmap_flush();
> +	if (batch_flush)
> +		try_to_unmap_flush();
>
>  	retry = 1;
>  	for (pass = 0;