Hi Ying,

On Mon, 2023-03-20 at 15:58 +0800, Huang, Ying wrote:
> Hi, Yujie,
>
> kernel test robot <yujie.liu@xxxxxxxxx> writes:
>
> > Hello,
> >
> > FYI, we noticed a -3.4% regression of vm-scalability.throughput due to commit:
> >
> > commit: 7e12beb8ca2ac98b2ec42e0ea4b76cdc93b58654 ("migrate_pages: batch flushing TLB")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> >
> > in testcase: vm-scalability
> > on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory
> > with following parameters:
> >
> >         runtime: 300s
> >         size: 512G
> >         test: anon-cow-rand-mt
> >         cpufreq_governor: performance
> >
> > test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
> > test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
> >
> >
> > If you fix the issue, kindly add following tag
> > > Reported-by: kernel test robot <yujie.liu@xxxxxxxxx>
> > > Link: https://lore.kernel.org/oe-lkp/202303192325.ecbaf968-yujie.liu@xxxxxxxxx
> >
>
> Thanks a lot for report!  Can you try whether the debug patch as
> below can restore the regression?

We've tested the patch and found the throughput score was partially
restored from -3.6% to -1.4%, still with a slight performance drop.
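As a quick sanity check on those two percentages, here is a minimal sketch that recomputes them from the vm-scalability.throughput values reported in the comparison data of this mail. The helper name pct_change and the "recovered" ratio are only illustrative and are not part of the lkp tooling.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


# vm-scalability.throughput values copied from the comparison data in this mail
base = 5528051      # ebe75e4751063 (parent commit)
batched = 5328506   # 7e12beb8ca2ac ("migrate_pages: batch flushing TLB")
debug = 5449450     # 9a30245d65679 (debug patch on top)

print(f"batch flushing TLB: {pct_change(base, batched):+.1f}%")
print(f"with debug patch:   {pct_change(base, debug):+.1f}%")

# Illustrative only: share of the lost throughput that the debug patch wins back
recovered = (debug - batched) / (base - batched)
print(f"recovered share:    {recovered:.0%}")
```

Under these numbers the debug patch wins back roughly three fifths of the lost throughput, which matches the "partially restored" wording above.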
Please check the detailed data as follows:

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/300s/512G/lkp-csl-2sp3/anon-cow-rand-mt/vm-scalability

commit:
  ebe75e4751063 ("migrate_pages: share more code between _unmap and _move")
  7e12beb8ca2ac ("migrate_pages: batch flushing TLB")
  9a30245d65679 ("dbg, rmap: avoid flushing TLB in batch if PTE is inaccessible")

ebe75e4751063dce 7e12beb8ca2ac98b2ec42e0ea4b 9a30245d656794d171cd798a2be
---------------- --------------------------- ---------------------------
         %stddev    %change          %stddev    %change          %stddev
            \          |                \          |                \
     57634            -3.5%      55603            -1.5%      56788        vm-scalability.median
     81.16 ± 12%       -5.0      76.17 ± 35%      -20.0      61.18 ± 21%  vm-scalability.stddev%
   5528051            -3.6%    5328506            -1.4%    5449450        vm-scalability.throughput
    200293 ±  3%      -7.3%     185675 ±  2%      -4.3%     191707 ±  2%  vm-scalability.time.involuntary_context_switches
  67952989 ±  5%     +43.1%   97269013 ±  2%     +35.6%   92147668 ±  3%  vm-scalability.time.minor_page_faults
      9006            -1.8%       8844            -0.6%       8956        vm-scalability.time.percent_of_cpu_this_job_got
      1178 ±  3%     +57.2%       1852 ±  3%      +8.6%       1278 ±  3%  vm-scalability.time.system_time
     26327            -4.5%      25132            -1.0%      26056        vm-scalability.time.user_time
     11378 ±  5%    +359.9%      52332 ±  7%    +118.5%      24867 ±  7%  vm-scalability.time.voluntary_context_switches
 1.662e+09            -3.7%  1.601e+09            -1.5%  1.638e+09        vm-scalability.workload
     79922 ±  3%      +9.3%      87378 ±  3%      +3.3%      82589 ±  8%  numa-meminfo.node1.SUnreclaim
    399014 ±192%     -84.9%      60246 ±129%     -13.6%     344869 ±239%  numa-meminfo.node1.Unevictable
      2022 ±  3%     +11.6%       2257            +3.6%       2095        vmstat.system.cs
    539357 ±  2%    +187.0%    1547747 ±  8%     +32.9%     716886 ±  4%  vmstat.system.in
      0.00 ±184%       +0.0       0.00 ±  6%       +0.0       0.00 ± 25%  mpstat.cpu.all.iowait%
      2.58             +1.7       4.27 ±  4%       +0.5       3.09 ±  3%  mpstat.cpu.all.irq%
      4.06 ±  3%       +2.3       6.36 ±  3%       +0.3       4.40 ±  3%  mpstat.cpu.all.sys%
     19980 ±  3%      +9.3%      21844 ±  3%      +3.3%      20646 ±  8%
numa-vmstat.node1.nr_slab_unreclaimable 99752 ±192% -84.9% 15061 ±129% -13.6% 86216 ±239% numa-vmstat.node1.nr_unevictable 99752 ±192% -84.9% 15061 ±129% -13.6% 86216 ±239% numa-vmstat.node1.nr_zone_unevictable 205569 ± 7% +131.1% 475135 ± 99% +66.5% 342364 ± 91% turbostat.C1 1.382e+09 ± 2% +140.0% 3.317e+09 ± 5% +30.4% 1.803e+09 ± 3% turbostat.IRQ 9095 ± 14% +446.4% 49695 ± 7% +149.0% 22643 ± 11% turbostat.POLL 86.84 -2.4% 84.76 -1.4% 85.63 turbostat.RAMWatt 200293 ± 3% -7.3% 185675 ± 2% -4.3% 191707 ± 2% time.involuntary_context_switches 67.11 ± 56% -92.3% 5.17 ± 55% -95.4% 3.11 ± 80% time.major_page_faults 67952989 ± 5% +43.1% 97269013 ± 2% +35.6% 92147668 ± 3% time.minor_page_faults 9006 -1.8% 8844 -0.6% 8956 time.percent_of_cpu_this_job_got 1178 ± 3% +57.2% 1852 ± 3% +8.6% 1278 ± 3% time.system_time 26327 -4.5% 25132 -1.0% 26056 time.user_time 11378 ± 5% +359.9% 52332 ± 7% +118.5% 24867 ± 7% time.voluntary_context_switches 143480 ± 3% -20.9% 113504 ± 11% -12.0% 126262 ± 4% sched_debug.cfs_rq:/.min_vruntime.stddev 548123 ± 7% -49.1% 279239 ± 34% -20.7% 434543 ± 9% sched_debug.cfs_rq:/.spread0.avg 655329 ± 6% -36.3% 417735 ± 22% -16.2% 549218 ± 6% sched_debug.cfs_rq:/.spread0.max 143388 ± 3% -20.8% 113612 ± 11% -11.9% 126295 ± 4% sched_debug.cfs_rq:/.spread0.stddev 39.81 ± 28% +45.0% 57.73 ± 19% +17.8% 46.89 ± 44% sched_debug.cfs_rq:/.util_est_enqueued.stddev 240478 ± 6% -12.9% 209367 ± 7% -12.0% 211715 ± 5% sched_debug.cpu.avg_idle.avg 1597 +10.4% 1763 ± 3% +2.3% 1633 sched_debug.cpu.clock_task.stddev 1938 ± 5% +29.1% 2503 +11.4% 2160 ± 3% sched_debug.cpu.nr_switches.min 39960890 ± 6% +68.3% 67272793 ± 2% +54.7% 61837739 ± 4% proc-vmstat.numa_hint_faults 19987976 ± 6% +68.7% 33722069 ± 2% +55.1% 30996483 ± 4% proc-vmstat.numa_hint_faults_local 28840932 ± 3% +6.9% 30817082 ± 5% +8.0% 31160418 ± 4% proc-vmstat.numa_hit 28753783 ± 3% +6.9% 30727992 ± 5% +8.1% 31074486 ± 4% proc-vmstat.numa_local 19745743 ± 5% +10.0% 21720583 ± 7% +11.8% 22080123 ± 6% 
proc-vmstat.numa_pages_migrated 40107839 ± 6% +68.1% 67430626 ± 2% +54.6% 61988683 ± 4% proc-vmstat.numa_pte_updates 37158989 ± 2% +5.3% 39124260 ± 3% +6.3% 39482935 ± 3% proc-vmstat.pgalloc_normal 68856116 ± 5% +42.6% 98184580 ± 2% +35.1% 93057570 ± 3% proc-vmstat.pgfault 19745743 ± 5% +10.0% 21720583 ± 7% +11.8% 22080123 ± 6% proc-vmstat.pgmigrate_success 19754280 ± 5% +10.0% 21735325 ± 7% +11.8% 22080663 ± 6% proc-vmstat.pgreuse 0.17 ± 7% +0.1 0.23 ± 3% +0.0 0.18 ± 5% perf-stat.i.branch-miss-rate% 8953845 ± 3% +61.0% 14417578 ± 3% +13.3% 10142474 ± 2% perf-stat.i.branch-misses 66.30 -1.8 64.47 -0.3 65.98 perf-stat.i.cache-miss-rate% 1904 ± 3% +12.3% 2139 +3.9% 1979 perf-stat.i.context-switches 158.09 +11.3% 175.92 ± 3% +7.5% 170.00 ± 2% perf-stat.i.cpu-migrations 0.04 ± 9% +0.0 0.05 ± 11% +0.0 0.04 ± 7% perf-stat.i.dTLB-load-miss-rate% 4856144 ± 8% +41.5% 6870029 ± 9% +12.3% 5455416 ± 7% perf-stat.i.dTLB-load-misses 9.10 -0.4 8.71 -0.1 8.97 perf-stat.i.dTLB-store-miss-rate% 5.33e+08 -4.4% 5.095e+08 -1.8% 5.233e+08 perf-stat.i.dTLB-store-misses 2454429 ± 2% +159.7% 6374895 ± 7% +26.7% 3110501 ± 5% perf-stat.i.iTLB-load-misses 116140 ± 2% +60.9% 186840 ± 7% -3.6% 111933 ± 4% perf-stat.i.iTLB-loads 41691 ± 5% -23.0% 32083 ± 26% +1.7% 42380 ± 20% perf-stat.i.instructions-per-iTLB-miss 0.31 ± 38% -59.1% 0.13 ± 27% -68.9% 0.10 ± 31% perf-stat.i.major-faults 224958 ± 5% +42.4% 320417 ± 2% +35.4% 304571 ± 3% perf-stat.i.minor-faults 50.61 +1.6 52.22 +0.7 51.35 perf-stat.i.node-load-miss-rate% 1.169e+08 +3.3% 1.208e+08 +0.9% 1.179e+08 perf-stat.i.node-load-misses 1.132e+08 -3.7% 1.089e+08 -2.1% 1.108e+08 perf-stat.i.node-loads 2.688e+08 -3.9% 2.582e+08 -1.8% 2.64e+08 perf-stat.i.node-store-misses 2.664e+08 -4.5% 2.543e+08 -1.7% 2.618e+08 perf-stat.i.node-stores 224959 ± 5% +42.4% 320418 ± 2% +35.4% 304571 ± 3% perf-stat.i.page-faults 0.08 ± 4% +0.0 0.12 ± 4% +0.0 0.09 ± 3% perf-stat.overall.branch-miss-rate% 67.15 -1.9 65.28 -0.5 66.64 perf-stat.overall.cache-miss-rate% 
366.74 +2.9% 377.43 +1.2% 371.26 perf-stat.overall.cycles-between-cache-misses 0.03 ± 8% +0.0 0.05 ± 10% +0.0 0.04 ± 8% perf-stat.overall.dTLB-load-miss-rate% 9.38 -0.4 8.97 -0.1 9.25 perf-stat.overall.dTLB-store-miss-rate% 95.49 +1.7 97.16 +1.0 96.53 perf-stat.overall.iTLB-load-miss-rate% 20490 ± 3% -61.8% 7826 ± 7% -21.5% 16077 ± 6% perf-stat.overall.instructions-per-iTLB-miss 50.81 +1.8 52.60 +0.8 51.56 perf-stat.overall.node-load-miss-rate% 9210 +3.0% 9485 +0.7% 9271 perf-stat.overall.path-length 8906114 ± 3% +61.8% 14412101 ± 3% +13.3% 10090374 ± 2% perf-stat.ps.branch-misses 1906 ± 3% +12.3% 2142 +3.8% 1979 perf-stat.ps.context-switches 157.57 +11.7% 176.03 ± 3% +7.6% 169.49 ± 2% perf-stat.ps.cpu-migrations 4843373 ± 8% +41.9% 6871859 ± 9% +12.3% 5440606 ± 7% perf-stat.ps.dTLB-load-misses 5.313e+08 -4.4% 5.077e+08 -1.8% 5.218e+08 perf-stat.ps.dTLB-store-misses 2444301 ± 2% +161.3% 6385873 ± 7% +26.8% 3098710 ± 5% perf-stat.ps.iTLB-load-misses 115384 ± 2% +61.5% 186290 ± 7% -3.7% 111109 ± 4% perf-stat.ps.iTLB-loads 0.31 ± 38% -59.0% 0.13 ± 27% -68.8% 0.10 ± 31% perf-stat.ps.major-faults 224444 ± 5% +42.8% 320615 ± 2% +35.3% 303619 ± 3% perf-stat.ps.minor-faults 1.165e+08 +3.4% 1.205e+08 +0.9% 1.176e+08 perf-stat.ps.node-load-misses 1.128e+08 -3.8% 1.086e+08 -2.1% 1.105e+08 perf-stat.ps.node-loads 2.68e+08 -4.0% 2.573e+08 -1.8% 2.632e+08 perf-stat.ps.node-store-misses 2.656e+08 -4.6% 2.534e+08 -1.7% 2.61e+08 perf-stat.ps.node-stores 224444 ± 5% +42.8% 320615 ± 2% +35.3% 303620 ± 3% perf-stat.ps.page-faults 19.08 ± 10% -1.7 17.34 ± 4% +0.5 19.59 perf-profile.calltrace.cycles-pp.nrand48_r 1.26 ± 15% -1.3 0.00 -1.3 0.00 perf-profile.calltrace.cycles-pp.migrate_folio_unmap.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_numa_page 1.14 ± 15% -1.1 0.00 -1.1 0.00 perf-profile.calltrace.cycles-pp.try_to_migrate.migrate_folio_unmap.migrate_pages_batch.migrate_pages.migrate_misplaced_page 1.12 ± 15% -1.1 0.00 -1.1 0.00 
perf-profile.calltrace.cycles-pp.rmap_walk_anon.try_to_migrate.migrate_folio_unmap.migrate_pages_batch.migrate_pages 1.08 ± 15% -1.1 0.00 -1.1 0.00 perf-profile.calltrace.cycles-pp.try_to_migrate_one.rmap_walk_anon.try_to_migrate.migrate_folio_unmap.migrate_pages_batch 0.92 ± 15% -0.9 0.00 -0.9 0.00 perf-profile.calltrace.cycles-pp.ptep_clear_flush.try_to_migrate_one.rmap_walk_anon.try_to_migrate.migrate_folio_unmap 0.91 ± 15% -0.9 0.00 -0.9 0.00 perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.ptep_clear_flush.try_to_migrate_one.rmap_walk_anon.try_to_migrate 0.91 ± 15% -0.9 0.00 -0.9 0.00 perf-profile.calltrace.cycles-pp.on_each_cpu_cond_mask.flush_tlb_mm_range.ptep_clear_flush.try_to_migrate_one.rmap_walk_anon 0.91 ± 15% -0.9 0.00 -0.9 0.00 perf-profile.calltrace.cycles-pp.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.ptep_clear_flush.try_to_migrate_one 6.40 ± 9% -0.5 5.94 ± 4% +0.1 6.54 perf-profile.calltrace.cycles-pp.lrand48_r 0.26 ±112% -0.3 0.00 -0.3 0.00 perf-profile.calltrace.cycles-pp._raw_spin_lock.do_numa_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault 0.19 ±141% -0.2 0.00 -0.2 0.00 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.do_numa_page.__handle_mm_fault.handle_mm_fault 4.13 ± 3% -0.1 4.04 -0.0 4.12 perf-profile.calltrace.cycles-pp.do_rw_once 0.06 ±282% -0.1 0.00 -0.1 0.00 perf-profile.calltrace.cycles-pp.rmap_walk_anon.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_numa_page 0.13 ±188% +0.1 0.24 ±144% -0.0 0.11 ±187% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.nrand48_r 0.00 +0.1 0.10 ±223% +0.0 0.00 perf-profile.calltrace.cycles-pp.update_load_avg.task_tick_fair.scheduler_tick.update_process_times.tick_sched_handle 0.00 +0.1 0.11 ±223% +0.0 0.00 perf-profile.calltrace.cycles-pp.update_curr.task_tick_fair.scheduler_tick.update_process_times.tick_sched_handle 0.07 ±282% +0.1 0.21 ±144% -0.1 0.00 
perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.nrand48_r 0.07 ±282% +0.1 0.21 ±144% -0.1 0.00 perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.nrand48_r 0.07 ±282% +0.1 0.22 ±144% -0.1 0.00 perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.nrand48_r 0.00 +0.2 0.17 ±141% +0.0 0.00 perf-profile.calltrace.cycles-pp.__default_send_IPI_dest_field.default_send_IPI_mask_sequence_phys.smp_call_function_many_cond.on_each_cpu_cond_mask.arch_tlbbatch_flush 0.00 +0.3 0.26 ±100% +0.0 0.00 perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.nrand48_r 0.00 +0.4 0.36 ± 70% +0.1 0.06 ±282% perf-profile.calltrace.cycles-pp.migrate_folio_extra.move_to_new_folio.migrate_pages_batch.migrate_pages.migrate_misplaced_page 0.00 +0.4 0.36 ± 70% +0.1 0.06 ±282% perf-profile.calltrace.cycles-pp.move_to_new_folio.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_numa_page 1.44 ± 28% +0.5 1.94 ± 61% +0.1 1.51 ± 25% perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.do_access 1.43 ± 29% +0.5 1.93 ± 61% +0.1 1.50 ± 25% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.do_access 0.55 ± 69% +0.5 1.08 ± 69% +0.0 0.60 ± 56% perf-profile.calltrace.cycles-pp.scheduler_tick.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues 1.34 ± 39% +0.6 1.90 ± 69% +0.0 1.35 ± 25% perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt 0.17 ±196% +0.6 0.73 ± 85% +0.2 0.33 ± 89% perf-profile.calltrace.cycles-pp.task_tick_fair.scheduler_tick.update_process_times.tick_sched_handle.tick_sched_timer 1.72 ± 25% 
+0.6 2.30 ± 48% +0.1 1.80 ± 22% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.do_access 1.08 ± 31% +0.6 1.66 ± 72% +0.1 1.13 ± 26% perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt 1.52 ± 28% +0.6 2.11 ± 52% +0.1 1.58 ± 25% perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.do_access 1.09 ± 31% +0.6 1.68 ± 72% +0.1 1.14 ± 26% perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt 1.18 ± 30% +0.6 1.78 ± 70% +0.1 1.24 ± 26% perf-profile.calltrace.cycles-pp.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt 0.00 +0.6 0.60 ± 8% +0.0 0.00 perf-profile.calltrace.cycles-pp.default_send_IPI_mask_sequence_phys.smp_call_function_many_cond.on_each_cpu_cond_mask.arch_tlbbatch_flush.try_to_unmap_flush 0.00 +0.6 0.64 ± 7% +0.0 0.00 perf-profile.calltrace.cycles-pp.flush_tlb_func.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function 0.00 +0.9 0.90 ± 10% +0.0 0.00 perf-profile.calltrace.cycles-pp.llist_reverse_order.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function 72.48 ± 3% +1.4 73.88 -0.7 71.79 perf-profile.calltrace.cycles-pp.do_access 0.00 +1.9 1.86 ± 9% +0.3 0.26 ±113% perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.do_access 0.00 +1.9 1.87 ± 8% +0.3 0.26 ±113% perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.do_access 0.00 +1.9 1.94 ± 8% +0.3 0.33 ± 91% perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.do_access 0.00 +2.6 2.59 ± 9% +0.6 0.59 ± 40% perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.do_access 0.00 +2.8 2.80 ± 8% 
+0.9 0.90 ± 18% perf-profile.calltrace.cycles-pp.llist_add_batch.smp_call_function_many_cond.on_each_cpu_cond_mask.arch_tlbbatch_flush.try_to_unmap_flush 3.30 ± 15% +6.6 9.88 ± 7% +0.9 4.18 ± 19% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault 3.34 ± 15% +6.6 9.94 ± 7% +0.9 4.22 ± 19% perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access 3.03 ± 15% +6.7 9.69 ± 7% +1.0 4.03 ± 19% perf-profile.calltrace.cycles-pp.do_numa_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault 3.68 ± 15% +6.8 10.48 ± 7% +0.9 4.63 ± 19% perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access 3.70 ± 15% +6.8 10.49 ± 7% +0.9 4.64 ± 19% perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.do_access 3.89 ± 14% +6.8 10.71 ± 7% +1.0 4.85 ± 19% perf-profile.calltrace.cycles-pp.asm_exc_page_fault.do_access 2.46 ± 15% +7.0 9.46 ± 7% +1.4 3.85 ± 19% perf-profile.calltrace.cycles-pp.migrate_misplaced_page.do_numa_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault 2.27 ± 15% +7.0 9.28 ± 7% +1.4 3.67 ± 19% perf-profile.calltrace.cycles-pp.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_numa_page.__handle_mm_fault 2.27 ± 15% +7.0 9.29 ± 7% +1.4 3.68 ± 19% perf-profile.calltrace.cycles-pp.migrate_pages.migrate_misplaced_page.do_numa_page.__handle_mm_fault.handle_mm_fault 0.00 +7.5 7.50 ± 7% +2.4 2.38 ± 18% perf-profile.calltrace.cycles-pp.smp_call_function_many_cond.on_each_cpu_cond_mask.arch_tlbbatch_flush.try_to_unmap_flush.migrate_pages_batch 0.00 +7.6 7.56 ± 7% +2.4 2.40 ± 18% perf-profile.calltrace.cycles-pp.on_each_cpu_cond_mask.arch_tlbbatch_flush.try_to_unmap_flush.migrate_pages_batch.migrate_pages 0.00 +7.6 7.57 ± 8% +2.4 2.40 ± 18% perf-profile.calltrace.cycles-pp.arch_tlbbatch_flush.try_to_unmap_flush.migrate_pages_batch.migrate_pages.migrate_misplaced_page 0.00 +7.6 7.57 ± 
7% +2.4 2.40 ± 18% perf-profile.calltrace.cycles-pp.try_to_unmap_flush.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_numa_page 16.69 ± 10% -1.3 15.43 ± 5% +0.5 17.16 perf-profile.children.cycles-pp.nrand48_r 1.51 ± 16% -1.1 0.42 ± 9% -1.2 0.31 ± 20% perf-profile.children.cycles-pp.rmap_walk_anon 1.25 ± 16% -1.0 0.30 ± 9% -1.0 0.29 ± 20% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath 0.92 ± 15% -0.9 0.00 -0.9 0.00 perf-profile.children.cycles-pp.ptep_clear_flush 0.92 ± 15% -0.9 0.00 -0.9 0.00 perf-profile.children.cycles-pp.flush_tlb_mm_range 9.27 ± 8% -0.9 8.37 ± 4% +0.2 9.45 perf-profile.children.cycles-pp.lrand48_r 1.08 ± 15% -0.9 0.18 ± 6% -1.0 0.12 ± 21% perf-profile.children.cycles-pp.try_to_migrate_one 1.14 ± 15% -0.9 0.26 ± 8% -0.9 0.19 ± 19% perf-profile.children.cycles-pp.try_to_migrate 1.05 ± 15% -0.8 0.21 ± 11% -0.9 0.16 ± 16% perf-profile.children.cycles-pp._raw_spin_lock 1.26 ± 15% -0.8 0.42 ± 8% -0.9 0.34 ± 21% perf-profile.children.cycles-pp.migrate_folio_unmap 0.46 ± 15% -0.3 0.14 ± 13% -0.3 0.11 ± 20% perf-profile.children.cycles-pp.page_vma_mapped_walk 0.34 ± 15% -0.2 0.11 ± 11% -0.3 0.08 ± 18% perf-profile.children.cycles-pp.remove_migration_pte 0.14 ± 16% -0.1 0.00 -0.1 0.00 perf-profile.children.cycles-pp.handle_pte_fault 4.37 ± 3% -0.1 4.29 -0.0 4.36 perf-profile.children.cycles-pp.do_rw_once 0.13 ± 22% -0.1 0.07 ± 11% -0.0 0.09 ± 23% perf-profile.children.cycles-pp.folio_lruvec_lock_irq 0.13 ± 22% -0.1 0.08 ± 10% -0.0 0.09 ± 22% perf-profile.children.cycles-pp._raw_spin_lock_irq 0.33 ± 2% -0.0 0.30 -0.0 0.32 ± 2% perf-profile.children.cycles-pp.lrand48_r@plt 0.17 ± 21% -0.0 0.14 ± 9% -0.0 0.15 ± 21% perf-profile.children.cycles-pp.folio_isolate_lru 0.02 ±112% -0.0 0.00 +0.0 0.03 ±111% perf-profile.children.cycles-pp.timerqueue_del 0.19 ± 20% -0.0 0.17 ± 8% -0.0 0.17 ± 20% perf-profile.children.cycles-pp.numamigrate_isolate_page 0.06 ± 13% -0.0 0.04 ± 45% -0.0 0.05 ± 37% 
perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe 0.06 ± 13% -0.0 0.04 ± 45% -0.0 0.05 ± 37% perf-profile.children.cycles-pp.do_syscall_64 0.01 ±193% -0.0 0.00 -0.0 0.01 ±188% perf-profile.children.cycles-pp.perf_mux_hrtimer_handler 0.09 ± 20% -0.0 0.08 ± 47% +0.0 0.09 ± 23% perf-profile.children.cycles-pp.tick_sched_do_timer 0.07 ± 39% -0.0 0.06 ± 45% +0.0 0.07 ± 28% perf-profile.children.cycles-pp.ktime_get_update_offsets_now 0.01 ±282% -0.0 0.00 -0.0 0.01 ±282% perf-profile.children.cycles-pp.perf_rotate_context 0.02 ±111% -0.0 0.02 ±142% +0.0 0.03 ±112% perf-profile.children.cycles-pp.irqtime_account_process_tick 0.06 ± 39% -0.0 0.06 ± 8% +0.0 0.07 ± 21% perf-profile.children.cycles-pp.rmqueue_bulk 0.00 +0.0 0.00 +0.0 0.01 ±282% perf-profile.children.cycles-pp.__free_one_page 0.00 +0.0 0.00 +0.0 0.01 ±187% perf-profile.children.cycles-pp.lru_add_fn 0.07 ± 27% +0.0 0.07 ± 47% -0.0 0.06 ± 55% perf-profile.children.cycles-pp.ktime_get 0.09 ± 15% +0.0 0.10 ± 8% +0.0 0.11 ± 21% perf-profile.children.cycles-pp.rmqueue 0.09 ± 39% +0.0 0.10 ± 50% -0.0 0.07 ± 75% perf-profile.children.cycles-pp.cpuacct_account_field 0.00 +0.0 0.01 ±223% +0.0 0.00 perf-profile.children.cycles-pp.run_posix_cpu_timers 0.00 +0.0 0.01 ±223% +0.0 0.00 perf-profile.children.cycles-pp.nohz_balance_exit_idle 0.00 +0.0 0.01 ±223% +0.0 0.00 perf-profile.children.cycles-pp.reweight_entity 0.00 +0.0 0.01 ±223% +0.0 0.00 perf-profile.children.cycles-pp.__hrtimer_next_event_base 0.00 +0.0 0.01 ±223% +0.0 0.00 perf-profile.children.cycles-pp.nohz_balancer_kick 0.00 +0.0 0.01 ±223% +0.0 0.00 perf-profile.children.cycles-pp.trigger_load_balance 0.00 +0.0 0.01 ±223% +0.0 0.00 perf-profile.children.cycles-pp.check_cpu_stall 0.00 +0.0 0.01 ±223% +0.0 0.00 perf-profile.children.cycles-pp.perf_event_task_tick 0.09 ± 16% +0.0 0.10 ± 7% +0.0 0.11 ± 22% perf-profile.children.cycles-pp.__alloc_pages 0.09 ± 16% +0.0 0.10 ± 10% +0.0 0.11 ± 21% perf-profile.children.cycles-pp.get_page_from_freelist 0.00 
+0.0 0.01 ±223% +0.0 0.00 perf-profile.children.cycles-pp.acct_account_cputime 0.09 ± 18% +0.0 0.10 ± 7% +0.0 0.11 ± 22% perf-profile.children.cycles-pp.__folio_alloc 0.01 ±282% +0.0 0.02 ±142% -0.0 0.00 perf-profile.children.cycles-pp.rcu_core 0.32 ± 19% +0.0 0.34 ± 45% +0.0 0.33 ± 32% perf-profile.children.cycles-pp.account_user_time 0.12 ± 95% +0.0 0.14 ± 6% -0.0 0.11 ± 16% perf-profile.children.cycles-pp.irqentry_exit_to_user_mode 0.09 ± 18% +0.0 0.11 ± 9% +0.0 0.11 ± 22% perf-profile.children.cycles-pp.alloc_misplaced_dst_page 0.06 ± 18% +0.0 0.08 ± 69% +0.0 0.07 ± 41% perf-profile.children.cycles-pp.rcu_pending 0.00 +0.0 0.02 ±141% +0.0 0.00 perf-profile.children.cycles-pp.set_tlb_ubc_flush_pending 0.00 +0.0 0.02 ±141% +0.0 0.00 perf-profile.children.cycles-pp.folio_lock_anon_vma_read 0.00 +0.0 0.02 ±141% +0.0 0.01 ±282% perf-profile.children.cycles-pp.folio_get_anon_vma 0.06 ± 18% +0.0 0.08 ± 9% +0.0 0.06 ± 19% perf-profile.children.cycles-pp.mt_find 0.21 ± 17% +0.0 0.23 ± 8% -0.0 0.21 ± 18% perf-profile.children.cycles-pp._raw_spin_lock_irqsave 0.06 ± 16% +0.0 0.08 ± 8% +0.0 0.08 ± 21% perf-profile.children.cycles-pp.free_unref_page 0.06 ± 18% +0.0 0.08 ± 11% +0.0 0.06 ± 20% perf-profile.children.cycles-pp.find_vma 0.11 ± 16% +0.0 0.12 ± 66% +0.0 0.13 ± 29% perf-profile.children.cycles-pp.__cgroup_account_cputime_field 0.01 ±282% +0.0 0.03 ±102% -0.0 0.00 perf-profile.children.cycles-pp.lapic_next_deadline 0.03 ± 71% +0.0 0.06 ± 8% +0.0 0.05 ± 39% perf-profile.children.cycles-pp.free_pcppages_bulk 0.02 ±209% +0.0 0.04 ±103% -0.0 0.02 ±142% perf-profile.children.cycles-pp.update_cfs_group 0.01 ±282% +0.0 0.03 ±105% -0.0 0.00 perf-profile.children.cycles-pp.hrtimer_update_next_event 0.05 ± 43% +0.0 0.08 ± 61% -0.0 0.05 ± 57% perf-profile.children.cycles-pp.update_irq_load_avg 0.00 +0.0 0.02 ± 99% +0.0 0.00 perf-profile.children.cycles-pp.__perf_sw_event 0.08 ± 15% +0.0 0.10 ± 10% +0.0 0.10 ± 21% perf-profile.children.cycles-pp.__list_del_entry_valid 0.09 ± 
47% +0.0 0.12 ± 70% -0.0 0.08 ± 43% perf-profile.children.cycles-pp.hrtimer_active 0.01 ±282% +0.0 0.03 ±106% -0.0 0.00 perf-profile.children.cycles-pp.update_min_vruntime 0.08 ± 18% +0.0 0.11 ± 68% +0.0 0.09 ± 26% perf-profile.children.cycles-pp.rcu_sched_clock_irq 0.07 ± 35% +0.0 0.10 ± 33% +0.0 0.08 ± 26% perf-profile.children.cycles-pp.clockevents_program_event 0.01 ±282% +0.0 0.04 ±110% -0.0 0.00 perf-profile.children.cycles-pp.timerqueue_add 0.04 ± 91% +0.0 0.07 ± 50% +0.0 0.06 ± 38% perf-profile.children.cycles-pp.arch_scale_freq_tick 0.02 ±154% +0.0 0.06 ± 74% +0.0 0.03 ± 92% perf-profile.children.cycles-pp.__do_softirq 0.00 +0.0 0.04 ± 71% +0.0 0.02 ±142% perf-profile.children.cycles-pp.can_change_pte_writable 0.01 ±282% +0.0 0.04 ±107% -0.0 0.00 perf-profile.children.cycles-pp.enqueue_hrtimer 0.00 +0.0 0.04 ± 44% +0.0 0.00 perf-profile.children.cycles-pp.tlb_is_not_lazy 0.00 +0.0 0.04 ± 45% +0.0 0.00 perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore 0.15 ± 20% +0.0 0.20 ± 8% -0.0 0.15 ± 21% perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave 0.11 ± 25% +0.0 0.16 ± 64% +0.0 0.11 ± 25% perf-profile.children.cycles-pp.update_rq_clock 0.03 ±118% +0.1 0.08 ± 58% +0.0 0.05 ± 59% perf-profile.children.cycles-pp.perf_trace_sched_stat_runtime 0.03 ±127% +0.1 0.09 ± 84% +0.0 0.04 ± 72% perf-profile.children.cycles-pp.__update_load_avg_cfs_rq 0.00 +0.1 0.06 ± 9% +0.0 0.02 ±142% perf-profile.children.cycles-pp.folio_migrate_flags 0.03 ±152% +0.1 0.09 ± 68% +0.0 0.04 ± 72% perf-profile.children.cycles-pp.__update_load_avg_se 0.00 +0.1 0.07 ± 8% +0.0 0.00 perf-profile.children.cycles-pp.native_sched_clock 0.05 ± 36% +0.1 0.12 ± 8% +0.1 0.10 ± 18% perf-profile.children.cycles-pp.exit_to_user_mode_loop 0.06 ± 13% +0.1 0.13 ± 8% +0.1 0.11 ± 16% perf-profile.children.cycles-pp.exit_to_user_mode_prepare 0.16 ± 13% +0.1 0.24 ± 10% +0.0 0.18 ± 19% perf-profile.children.cycles-pp.up_read 0.00 +0.1 0.08 ± 10% +0.0 0.00 
perf-profile.children.cycles-pp.sched_clock_cpu 0.02 ±141% +0.1 0.10 ± 8% +0.0 0.05 ± 42% perf-profile.children.cycles-pp.uncharge_batch 0.01 ±282% +0.1 0.09 ± 12% +0.0 0.04 ± 75% perf-profile.children.cycles-pp.page_counter_uncharge 0.04 ± 71% +0.1 0.12 ± 8% +0.1 0.10 ± 18% perf-profile.children.cycles-pp.task_work_run 0.00 +0.1 0.09 ± 10% +0.0 0.01 ±282% perf-profile.children.cycles-pp._find_next_bit 0.02 ±141% +0.1 0.10 ± 10% +0.0 0.06 ± 44% perf-profile.children.cycles-pp.__mem_cgroup_uncharge 0.02 ±141% +0.1 0.10 ± 10% +0.0 0.06 ± 44% perf-profile.children.cycles-pp.__folio_put 0.19 ± 17% +0.1 0.28 ± 11% +0.0 0.21 ± 18% perf-profile.children.cycles-pp.down_read_trylock 0.03 ± 90% +0.1 0.12 ± 8% +0.1 0.10 ± 16% perf-profile.children.cycles-pp.change_pte_range 0.03 ± 90% +0.1 0.12 ± 8% +0.1 0.10 ± 18% perf-profile.children.cycles-pp.task_numa_work 0.03 ± 90% +0.1 0.12 ± 8% +0.1 0.10 ± 18% perf-profile.children.cycles-pp.change_prot_numa 0.03 ± 90% +0.1 0.12 ± 8% +0.1 0.10 ± 18% perf-profile.children.cycles-pp.change_protection_range 0.03 ± 90% +0.1 0.12 ± 8% +0.1 0.10 ± 18% perf-profile.children.cycles-pp.change_pmd_range 0.21 ± 19% +0.1 0.31 ± 8% +0.0 0.22 ± 21% perf-profile.children.cycles-pp.folio_batch_move_lru 0.02 ±142% +0.1 0.12 ± 6% +0.0 0.04 ± 72% perf-profile.children.cycles-pp.irqtime_account_irq 0.08 ± 36% +0.1 0.18 ± 24% +0.0 0.09 ± 24% perf-profile.children.cycles-pp.__irq_exit_rcu 0.21 ± 19% +0.1 0.31 ± 8% +0.0 0.22 ± 20% perf-profile.children.cycles-pp.lru_add_drain 0.21 ± 19% +0.1 0.31 ± 8% +0.0 0.22 ± 20% perf-profile.children.cycles-pp.lru_add_drain_cpu 0.03 ± 71% +0.1 0.14 ± 8% +0.0 0.08 ± 25% perf-profile.children.cycles-pp.mem_cgroup_migrate 0.01 ±187% +0.1 0.13 ± 6% +0.1 0.07 ± 26% perf-profile.children.cycles-pp.page_counter_charge 0.17 ± 13% +0.1 0.30 ± 9% +0.1 0.24 ± 19% perf-profile.children.cycles-pp.folio_copy 0.17 ± 14% +0.1 0.30 ± 9% +0.1 0.23 ± 20% perf-profile.children.cycles-pp.copy_page 0.09 ± 7% +0.2 0.24 ± 9% +0.0 0.11 ± 14% 
perf-profile.children.cycles-pp.sync_regs 0.21 ± 48% +0.2 0.39 ± 65% +0.0 0.22 ± 28% perf-profile.children.cycles-pp.update_load_avg 0.25 ± 39% +0.2 0.43 ± 61% +0.0 0.27 ± 25% perf-profile.children.cycles-pp.update_curr 0.25 ± 12% +0.3 0.51 ± 8% +0.1 0.36 ± 20% perf-profile.children.cycles-pp.migrate_folio_extra 0.25 ± 12% +0.3 0.51 ± 8% +0.1 0.36 ± 20% perf-profile.children.cycles-pp.move_to_new_folio 0.11 ± 20% +0.3 0.40 ± 7% +0.0 0.16 ± 15% perf-profile.children.cycles-pp.native_irq_return_iret 0.06 ± 40% +0.4 0.47 ± 9% +0.1 0.13 ± 23% perf-profile.children.cycles-pp.__default_send_IPI_dest_field 0.00 +0.4 0.44 ± 9% +0.1 0.12 ± 22% perf-profile.children.cycles-pp.native_flush_tlb_local 0.68 ± 45% +0.5 1.16 ± 62% +0.0 0.71 ± 28% perf-profile.children.cycles-pp.task_tick_fair 0.08 ± 16% +0.5 0.62 ± 9% +0.1 0.17 ± 21% perf-profile.children.cycles-pp.default_send_IPI_mask_sequence_phys 0.96 ± 40% +0.6 1.57 ± 60% +0.0 1.00 ± 27% perf-profile.children.cycles-pp.scheduler_tick 1.56 ± 32% +0.7 2.26 ± 55% +0.1 1.64 ± 25% perf-profile.children.cycles-pp.update_process_times 1.58 ± 32% +0.7 2.29 ± 55% +0.1 1.65 ± 25% perf-profile.children.cycles-pp.tick_sched_handle 1.71 ± 31% +0.7 2.42 ± 54% +0.1 1.79 ± 25% perf-profile.children.cycles-pp.tick_sched_timer 1.85 ± 30% +0.7 2.60 ± 52% +0.1 1.94 ± 25% perf-profile.children.cycles-pp.__hrtimer_run_queues 2.09 ± 29% +0.8 2.86 ± 50% +0.1 2.18 ± 24% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt 2.06 ± 29% +0.8 2.85 ± 50% +0.1 2.16 ± 24% perf-profile.children.cycles-pp.hrtimer_interrupt 2.48 ± 26% +0.8 3.28 ± 45% +0.1 2.60 ± 22% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt 2.19 ± 29% +0.8 2.99 ± 49% +0.1 2.29 ± 24% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt 0.09 ± 17% +1.2 1.32 ± 7% +0.4 0.45 ± 21% perf-profile.children.cycles-pp.flush_tlb_func 0.25 ± 14% +1.6 1.85 ± 9% +0.3 0.55 ± 18% perf-profile.children.cycles-pp.llist_reverse_order 72.83 ± 3% +1.9 74.77 -0.6 72.25 
perf-profile.children.cycles-pp.do_access 0.40 ± 15% +2.5 2.86 ± 8% +0.5 0.93 ± 18% perf-profile.children.cycles-pp.llist_add_batch 0.41 ± 14% +3.3 3.76 ± 8% +0.7 1.14 ± 19% perf-profile.children.cycles-pp.__sysvec_call_function 0.41 ± 14% +3.4 3.76 ± 8% +0.7 1.14 ± 19% perf-profile.children.cycles-pp.__flush_smp_call_function_queue 0.43 ± 14% +3.5 3.90 ± 8% +0.7 1.17 ± 19% perf-profile.children.cycles-pp.sysvec_call_function 0.55 ± 12% +4.4 4.95 ± 8% +0.9 1.40 ± 19% perf-profile.children.cycles-pp.asm_sysvec_call_function 3.31 ± 15% +6.6 9.89 ± 7% +0.9 4.19 ± 19% perf-profile.children.cycles-pp.__handle_mm_fault 3.34 ± 15% +6.6 9.95 ± 7% +0.9 4.23 ± 19% perf-profile.children.cycles-pp.handle_mm_fault 3.03 ± 15% +6.7 9.69 ± 7% +1.0 4.03 ± 19% perf-profile.children.cycles-pp.do_numa_page 0.91 ± 15% +6.7 7.59 ± 7% +1.5 2.42 ± 18% perf-profile.children.cycles-pp.smp_call_function_many_cond 0.91 ± 15% +6.7 7.59 ± 7% +1.5 2.42 ± 18% perf-profile.children.cycles-pp.on_each_cpu_cond_mask 3.70 ± 15% +6.8 10.49 ± 7% +0.9 4.64 ± 19% perf-profile.children.cycles-pp.do_user_addr_fault 3.70 ± 15% +6.8 10.50 ± 7% +0.9 4.64 ± 19% perf-profile.children.cycles-pp.exc_page_fault 3.91 ± 14% +6.8 10.76 ± 7% +1.0 4.88 ± 19% perf-profile.children.cycles-pp.asm_exc_page_fault 2.46 ± 15% +7.0 9.46 ± 7% +1.4 3.85 ± 19% perf-profile.children.cycles-pp.migrate_misplaced_page 2.27 ± 15% +7.0 9.28 ± 7% +1.4 3.67 ± 19% perf-profile.children.cycles-pp.migrate_pages_batch 2.27 ± 15% +7.0 9.29 ± 7% +1.4 3.68 ± 19% perf-profile.children.cycles-pp.migrate_pages 0.00 +7.6 7.57 ± 7% +2.4 2.40 ± 18% perf-profile.children.cycles-pp.try_to_unmap_flush 0.00 +7.6 7.57 ± 7% +2.4 2.40 ± 18% perf-profile.children.cycles-pp.arch_tlbbatch_flush 66.95 ± 3% -7.7 59.28 ± 2% -2.0 64.95 perf-profile.self.cycles-pp.do_access 13.38 ± 11% -1.4 12.02 ± 4% +0.3 13.71 perf-profile.self.cycles-pp.nrand48_r 8.81 ± 9% -1.1 7.70 ± 3% +0.1 8.94 ± 2% perf-profile.self.cycles-pp.lrand48_r 1.14 ± 16% -0.9 0.28 ± 9% -0.9 0.28 ± 
21% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      4.08 ±  3%      -0.3        3.77            -0.0        4.03        perf-profile.self.cycles-pp.do_rw_once
      0.06 ±187%      -0.1        0.00            -0.1        0.00        perf-profile.self.cycles-pp.irqentry_exit_to_user_mode
      0.29 ±  4%      -0.0        0.26            -0.0        0.28 ±  2%  perf-profile.self.cycles-pp.lrand48_r@plt
      0.12 ± 27%      -0.0        0.10 ± 53%      +0.0        0.13 ± 36%  perf-profile.self.cycles-pp.account_user_time
      0.02 ±141%      -0.0        0.00            +0.0        0.02 ±112%  perf-profile.self.cycles-pp.hrtimer_interrupt
      0.07 ± 16%      -0.0        0.07 ± 47%      +0.0        0.08 ± 25%  perf-profile.self.cycles-pp.tick_sched_do_timer
      0.06 ± 55%      -0.0        0.05 ± 46%      +0.0        0.06 ± 42%  perf-profile.self.cycles-pp.ktime_get_update_offsets_now
      0.02 ±111%      -0.0        0.02 ±142%      +0.0        0.03 ±112%  perf-profile.self.cycles-pp.irqtime_account_process_tick
      0.01 ±188%      -0.0        0.01 ±223%      -0.0        0.01 ±282%  perf-profile.self.cycles-pp.rmap_walk_anon
      0.00            +0.0        0.00            +0.0        0.01 ±282%  perf-profile.self.cycles-pp.__free_one_page
      0.06 ± 42%      +0.0        0.07 ± 46%      +0.0        0.07 ± 43%  perf-profile.self.cycles-pp.update_process_times
      0.09 ± 39%      +0.0        0.10 ± 50%      -0.0        0.07 ± 75%  perf-profile.self.cycles-pp.cpuacct_account_field
      0.00            +0.0        0.01 ±223%      +0.0        0.00        perf-profile.self.cycles-pp.set_tlb_ubc_flush_pending
      0.00            +0.0        0.01 ±223%      +0.0        0.00        perf-profile.self.cycles-pp.__irq_exit_rcu
      0.00            +0.0        0.01 ±223%      +0.0        0.00        perf-profile.self.cycles-pp.perf_event_task_tick
      0.00            +0.0        0.01 ±223%      +0.0        0.00        perf-profile.self.cycles-pp.run_posix_cpu_timers
      0.00            +0.0        0.01 ±223%      +0.0        0.00        perf-profile.self.cycles-pp.nohz_balance_exit_idle
      0.00            +0.0        0.01 ±223%      +0.0        0.00        perf-profile.self.cycles-pp.reweight_entity
      0.00            +0.0        0.01 ±223%      +0.0        0.01 ±187%  perf-profile.self.cycles-pp.can_change_pte_writable
      0.06 ± 14%      +0.0        0.07 ± 11%      -0.0        0.04 ± 72%  perf-profile.self.cycles-pp.mt_find
      0.00            +0.0        0.01 ±223%      +0.0        0.00        perf-profile.self.cycles-pp.trigger_load_balance
      0.00            +0.0        0.01 ±223%      +0.0        0.00        perf-profile.self.cycles-pp.check_cpu_stall
      0.00            +0.0        0.01 ±223%      +0.0        0.00        perf-profile.self.cycles-pp.timerqueue_add
      0.00            +0.0        0.01 ±223%      +0.0        0.00        perf-profile.self.cycles-pp.acct_account_cputime
      0.08 ± 17%      +0.0        0.09 ± 13%      +0.0        0.08 ± 21%  perf-profile.self.cycles-pp.page_vma_mapped_walk
      0.11 ± 17%      +0.0        0.13 ± 15%      +0.0        0.12 ± 20%  perf-profile.self.cycles-pp.__handle_mm_fault
      0.01 ±282%      +0.0        0.02 ± 99%      +0.0        0.02 ±112%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
      0.10 ± 16%      +0.0        0.12 ± 65%      +0.0        0.12 ± 29%  perf-profile.self.cycles-pp.__cgroup_account_cputime_field
      0.01 ±282%      +0.0        0.03 ±102%      -0.0        0.00        perf-profile.self.cycles-pp.lapic_next_deadline
      0.01 ±282%      +0.0        0.03 ±150%      +0.0        0.02 ±112%  perf-profile.self.cycles-pp.rcu_pending
      0.02 ±209%      +0.0        0.04 ±103%      -0.0        0.02 ±142%  perf-profile.self.cycles-pp.update_cfs_group
      0.08 ± 47%      +0.0        0.10 ± 68%      -0.0        0.07 ± 45%  perf-profile.self.cycles-pp.hrtimer_active
      0.05 ± 43%      +0.0        0.08 ± 61%      -0.0        0.05 ± 57%  perf-profile.self.cycles-pp.update_irq_load_avg
      0.04 ± 94%      +0.0        0.06 ± 48%      +0.0        0.05 ± 56%  perf-profile.self.cycles-pp.ktime_get
      0.07 ± 16%      +0.0        0.10 ± 10%      +0.0        0.10 ± 21%  perf-profile.self.cycles-pp.__list_del_entry_valid
      0.01 ±282%      +0.0        0.03 ±106%      -0.0        0.00        perf-profile.self.cycles-pp.update_min_vruntime
      0.04 ± 91%      +0.0        0.07 ± 50%      +0.0        0.06 ± 38%  perf-profile.self.cycles-pp.arch_scale_freq_tick
      0.00            +0.0        0.03 ± 70%      +0.0        0.00        perf-profile.self.cycles-pp.default_send_IPI_mask_sequence_phys
      0.01 ±282%      +0.0        0.04 ± 75%      +0.0        0.02 ±112%  perf-profile.self.cycles-pp.__hrtimer_run_queues
      0.06 ± 49%      +0.0        0.10 ± 65%      -0.0        0.06 ± 56%  perf-profile.self.cycles-pp.scheduler_tick
      0.03 ±113%      +0.0        0.07 ± 83%      +0.0        0.04 ± 71%  perf-profile.self.cycles-pp.update_rq_clock
      0.00            +0.0        0.04 ± 44%      +0.0        0.01 ±187%  perf-profile.self.cycles-pp.folio_migrate_flags
      0.09 ± 14%      +0.0        0.14 ± 20%      +0.0        0.10 ± 16%  perf-profile.self.cycles-pp._raw_spin_lock
      0.02 ±191%      +0.0        0.06 ± 86%      +0.0        0.03 ± 90%  perf-profile.self.cycles-pp.__update_load_avg_se
      0.03 ±118%      +0.0        0.08 ± 57%      +0.0        0.05 ± 59%  perf-profile.self.cycles-pp.perf_trace_sched_stat_runtime
      0.02 ±111%      +0.1        0.08 ± 10%      +0.0        0.06 ± 15%  perf-profile.self.cycles-pp.change_pte_range
      0.15 ± 14%      +0.1        0.20 ± 10%      +0.0        0.17 ± 21%  perf-profile.self.cycles-pp.up_read
      0.00            +0.1        0.05 ±  8%      +0.0        0.01 ±188%  perf-profile.self.cycles-pp.try_to_migrate_one
      0.19 ± 16%      +0.1        0.24 ± 11%      +0.0        0.20 ± 19%  perf-profile.self.cycles-pp.down_read_trylock
      0.03 ±151%      +0.1        0.09 ± 84%      +0.0        0.04 ± 72%  perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
      0.00            +0.1        0.07 ±  8%      +0.0        0.00        perf-profile.self.cycles-pp._find_next_bit
      0.00            +0.1        0.07 ±  8%      +0.0        0.00        perf-profile.self.cycles-pp.native_sched_clock
      0.00            +0.1        0.07 ± 12%      +0.0        0.03 ±113%  perf-profile.self.cycles-pp.page_counter_uncharge
      0.09 ± 41%      +0.1        0.16 ± 69%      +0.0        0.09 ± 42%  perf-profile.self.cycles-pp.task_tick_fair
      0.11 ± 49%      +0.1        0.19 ± 74%      -0.0        0.11 ± 29%  perf-profile.self.cycles-pp.update_load_avg
      0.01 ±282%      +0.1        0.11 ±  8%      +0.1        0.06 ± 43%  perf-profile.self.cycles-pp.page_counter_charge
      0.16 ± 15%      +0.1        0.27 ±  9%      +0.1        0.22 ± 21%  perf-profile.self.cycles-pp.copy_page
      0.16 ± 41%      +0.1        0.28 ± 65%      +0.0        0.18 ± 25%  perf-profile.self.cycles-pp.update_curr
      0.09 ±  7%      +0.2        0.24 ±  9%      +0.0        0.11 ± 14%  perf-profile.self.cycles-pp.sync_regs
      0.11 ± 20%      +0.3        0.39 ±  8%      +0.0        0.15 ± 15%  perf-profile.self.cycles-pp.native_irq_return_iret
      0.06 ± 40%      +0.4        0.47 ±  9%      +0.1        0.13 ± 23%  perf-profile.self.cycles-pp.__default_send_IPI_dest_field
      0.00            +0.4        0.44 ± 10%      +0.1        0.11 ± 19%  perf-profile.self.cycles-pp.native_flush_tlb_local
      0.07 ± 15%      +0.5        0.62 ±  7%      +0.1        0.16 ± 18%  perf-profile.self.cycles-pp.__flush_smp_call_function_queue
      0.06 ± 16%      +0.8        0.88 ±  7%      +0.3        0.33 ± 21%  perf-profile.self.cycles-pp.flush_tlb_func
      0.25 ± 14%      +1.6        1.85 ±  9%      +0.3        0.55 ± 18%  perf-profile.self.cycles-pp.llist_reverse_order
      0.35 ± 15%      +2.1        2.40 ±  8%      +0.4        0.76 ± 18%  perf-profile.self.cycles-pp.llist_add_batch
      0.37 ± 17%      +3.1        3.49 ±  7%      +0.7        1.10 ± 18%  perf-profile.self.cycles-pp.smp_call_function_many_cond

> Best Regards,
> Huang, Ying
>
> -------------------------------------8<------------------------------------
> From 1ac61967b54bbdc1ca20af16f9dfb2507a4d4811 Mon Sep 17 00:00:00 2001
> From: Huang Ying <ying.huang@xxxxxxxxx>
> Date: Mon, 20 Mar 2023 15:48:39 +0800
> Subject: [PATCH] dbg, rmap: avoid flushing TLB in batch if PTE is inaccessible
>
> Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
> ---
>  mm/rmap.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 8632e02661ac..3c7c43642d7c 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1582,7 +1582,8 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>  				 */
>  				pteval = ptep_get_and_clear(mm, address, pvmw.pte);
>  
> -				set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
> +				if (pte_accessible(mm, pteval))
> +					set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
>  			} else {
>  				pteval = ptep_clear_flush(vma, address, pvmw.pte);
>  			}
> @@ -1963,7 +1964,8 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
>  				 */
>  				pteval = ptep_get_and_clear(mm, address, pvmw.pte);
>  
> -				set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
> +				if (pte_accessible(mm, pteval))
> +					set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
>  			} else {
>  				pteval = ptep_clear_flush(vma, address, pvmw.pte);
>  			}