Re: [linus:master] [migrate_pages] 7e12beb8ca: vm-scalability.throughput -3.4% regression

On Tue, 2023-03-21 at 13:43 +0800, Huang, Ying wrote:
> "Liu, Yujie" <yujie.liu@xxxxxxxxx> writes:
> 
> > Hi Ying,
> > 
> > On Mon, 2023-03-20 at 15:58 +0800, Huang, Ying wrote:
> > > Hi, Yujie,
> > > 
> > > kernel test robot <yujie.liu@xxxxxxxxx> writes:
> > > 
> > > > Hello,
> > > > 
> > > > FYI, we noticed a -3.4% regression of vm-scalability.throughput due to commit:
> > > > 
> > > > commit: 7e12beb8ca2ac98b2ec42e0ea4b76cdc93b58654 ("migrate_pages: batch flushing TLB")
> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > > 
> > > > in testcase: vm-scalability
> > > > on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory
> > > > with following parameters:
> > > > 
> > > >         runtime: 300s
> > > >         size: 512G
> > > >         test: anon-cow-rand-mt
> > > >         cpufreq_governor: performance
> > > > 
> > > > test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
> > > > test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
> > > > 
> > > > 
> > > > If you fix the issue, kindly add following tag
> > > > > Reported-by: kernel test robot <yujie.liu@xxxxxxxxx>
> > > > > Link: https://lore.kernel.org/oe-lkp/202303192325.ecbaf968-yujie.liu@xxxxxxxxx
> > > > 
> > > 
> > > Thanks a lot for the report!  Can you check whether the debug patch
> > > below resolves the regression?
> > 
> > We've tested the patch and found the throughput score was partially
> > restored from -3.6% to -1.4%, still with a slight performance drop.
> > Please check the detailed data as follows:
> 
> Good!  Thanks for your detailed data!
> 
> >       0.09 ± 17%      +1.2        1.32 ±  7%      +0.4        0.45 ± 21%  perf-profile.children.cycles-pp.flush_tlb_func
> 
> It appears that we can reduce the unnecessary TLB flushing effectively
> with the previous debug patch.  But the batched flush (full flush) is
> still slower than the non-batched flush (flush one page).
> 
> Can you try the debug patch below to check whether it resolves the
> regression completely?  The new debug patch can be applied on top of the
> previous debug patch.

With the second debug patch applied, the throughput change is -0.7%.
The results fluctuate from run to run, and the standard deviation is
slightly larger than 0.7%, so the throughput score alone is not
conclusive. Please check the other metrics below to see whether the
regression is fully resolved. Thanks.
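
As a rough illustration of the noise argument, the sketch below recomputes
the %change figures from the vm-scalability.throughput row of the table
that follows and compares each one against a noise threshold. The helper
names and the 0.8% threshold are assumptions made for illustration only;
the threshold is a placeholder standing in for "slightly larger than 0.7%",
not a measured value.

```python
# Throughput values copied from the vm-scalability.throughput row.
BASE = 5528051    # ebe75e4751063 (base commit)
DEBUG1 = 5449450  # 9a30245d65679 (first debug patch)
DEBUG2 = 5487122  # a65085664418d (second debug patch)

# Placeholder noise level in percent. This is an assumption, not a
# measurement: the report only says the stddev is a bit above 0.7%.
ASSUMED_NOISE_PCT = 0.8


def pct_change(base, value):
    """Relative change of value against base, in percent."""
    return (value - base) / base * 100.0


def within_noise(change_pct, noise_pct):
    """True if the change is no larger in magnitude than the noise level."""
    return abs(change_pct) <= noise_pct


for name, value in (("debug patch 1", DEBUG1), ("debug patch 2", DEBUG2)):
    change = pct_change(BASE, value)
    verdict = "within noise" if within_noise(change, ASSUMED_NOISE_PCT) else "outside noise"
    print(f"{name}: {change:+.1f}% ({verdict}, assuming {ASSUMED_NOISE_PCT}% noise)")
```

Under that assumed threshold, the -1.4% change of the first debug patch
stands out from the noise while the -0.7% change of the second does not,
which is why the other metrics matter more than the throughput score here.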

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/300s/512G/lkp-csl-2sp3/anon-cow-rand-mt/vm-scalability

commit: 
  ebe75e4751063 ("migrate_pages: share more code between _unmap and _move")
  9a30245d65679 ("dbg, rmap: avoid flushing TLB in batch if PTE is inaccessible")
  a65085664418d ("dbg, migrate_pages: don't batch flushing for single page migration")

ebe75e4751063dce 9a30245d656794d171cd798a2be a65085664418d7ed1560095d466 
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \  
     57634            -1.5%      56788            -0.8%      57199        vm-scalability.median
     81.16 ± 12%     -20.0       61.18 ± 21%      -5.0       76.14 ± 12%  vm-scalability.stddev%
   5528051            -1.4%    5449450            -0.7%    5487122        vm-scalability.throughput
    305.38            -0.1%     305.19            -0.1%     305.15        vm-scalability.time.elapsed_time
    305.38            -0.1%     305.19            -0.1%     305.15        vm-scalability.time.elapsed_time.max
    652.11 ± 88%     +54.5%       1007 ± 63%     +45.4%     948.20 ± 80%  vm-scalability.time.file_system_inputs
    200293 ±  3%      -4.3%     191707 ±  2%      +1.9%     204033 ±  3%  vm-scalability.time.involuntary_context_switches
     67.11 ± 56%     -95.4%       3.11 ± 80%     -11.3%      59.50 ± 27%  vm-scalability.time.major_page_faults
  32930133            -0.0%   32924571            -0.0%   32922758        vm-scalability.time.maximum_resident_set_size
  67952989 ±  5%     +35.6%   92147668 ±  3%      +2.8%   69849921 ±  8%  vm-scalability.time.minor_page_faults
      4096            +0.0%       4096            +0.0%       4096        vm-scalability.time.page_size
      9006            -0.6%       8956            -0.0%       9005        vm-scalability.time.percent_of_cpu_this_job_got
      1178 ±  3%      +8.6%       1278 ±  3%      -1.9%       1155 ±  4%  vm-scalability.time.system_time
     26327            -1.0%      26056            +0.0%      26327        vm-scalability.time.user_time
     11378 ±  5%    +118.5%      24867 ±  7%      -0.5%      11327 ±  9%  vm-scalability.time.voluntary_context_switches
 1.662e+09            -1.5%  1.638e+09            -0.8%  1.648e+09        vm-scalability.workload
 1.143e+09            +0.6%   1.15e+09 ±  2%      +2.9%  1.176e+09 ±  3%  cpuidle..time
   2464665 ±  3%      +2.0%    2515047 ±  4%      +2.2%    2519159 ±  8%  cpuidle..usage
    367.89            -0.2%     367.16            -0.2%     367.32        uptime.boot
      6393 ±  3%      -0.9%       6336 ±  2%      -0.5%       6363 ±  2%  uptime.idle
     59.33 ±  4%      -0.4%      59.06 ±  2%      -0.6%      58.94 ±  3%  boot-time.boot
     33.79 ±  3%      -0.8%      33.54            -0.7%      33.57        boot-time.dhcp
      5106 ±  4%      -0.6%       5076 ±  2%      -0.8%       5066 ±  3%  boot-time.idle
      1.05 ±  8%      -4.4%       1.01            -4.3%       1.01        boot-time.smp_boot
      3.78            -0.0        3.77 ±  3%      +0.1        3.91 ±  4%  mpstat.cpu.all.idle%
      0.00 ±184%      +0.0        0.00 ± 25%      -0.0        0.00 ± 60%  mpstat.cpu.all.iowait%
      2.58            +0.5        3.09 ±  3%      -0.0        2.56        mpstat.cpu.all.irq%
      0.03 ±  4%      +0.0        0.03 ±  8%      -0.0        0.03 ±  5%  mpstat.cpu.all.soft%
      4.06 ±  3%      +0.3        4.40 ±  3%      -0.1        3.98 ±  4%  mpstat.cpu.all.sys%
     89.55            -0.8       88.71            -0.0       89.52        mpstat.cpu.all.usr%
      0.00          -100.0%       0.00          -100.0%       0.00        numa-numastat.node0.interleave_hit
  14350133 ±  4%      +7.7%   15454129 ±  4%      -0.5%   14283646 ±  4%  numa-numastat.node0.local_node
  14405409 ±  4%      +7.5%   15487972 ±  4%      -0.5%   14332762 ±  4%  numa-numastat.node0.numa_hit
     55258 ± 48%     -37.3%      34622 ± 67%     -13.6%      47731 ± 51%  numa-numastat.node0.other_node
      0.00          -100.0%       0.00          -100.0%       0.00        numa-numastat.node1.interleave_hit
  14402027 ±  3%      +8.4%   15618857 ±  5%      -0.1%   14389667 ±  4%  numa-numastat.node1.local_node
  14433899 ±  3%      +8.6%   15670948 ±  5%      -0.0%   14429236 ±  4%  numa-numastat.node1.numa_hit
     31821 ± 84%     +64.9%      52467 ± 44%     +30.8%      41622 ± 56%  numa-numastat.node1.other_node
    305.38            -0.1%     305.19            -0.1%     305.15        time.elapsed_time
    305.38            -0.1%     305.19            -0.1%     305.15        time.elapsed_time.max
    652.11 ± 88%     +54.5%       1007 ± 63%     +45.4%     948.20 ± 80%  time.file_system_inputs
    200293 ±  3%      -4.3%     191707 ±  2%      +1.9%     204033 ±  3%  time.involuntary_context_switches
     67.11 ± 56%     -95.4%       3.11 ± 80%     -11.3%      59.50 ± 27%  time.major_page_faults
  32930133            -0.0%   32924571            -0.0%   32922758        time.maximum_resident_set_size
  67952989 ±  5%     +35.6%   92147668 ±  3%      +2.8%   69849921 ±  8%  time.minor_page_faults
      4096            +0.0%       4096            +0.0%       4096        time.page_size
      9006            -0.6%       8956            -0.0%       9005        time.percent_of_cpu_this_job_got
      1178 ±  3%      +8.6%       1278 ±  3%      -1.9%       1155 ±  4%  time.system_time
     26327            -1.0%      26056            +0.0%      26327        time.user_time
     11378 ±  5%    +118.5%      24867 ±  7%      -0.5%      11327 ±  9%  time.voluntary_context_switches
      4.00            +0.0%       4.00            +0.0%       4.00        vmstat.cpu.id
      6.00           +16.7%       7.00            +0.0%       6.00        vmstat.cpu.sy
     88.33            -0.9%      87.56            +0.3%      88.60        vmstat.cpu.us
      0.00          -100.0%       0.00          -100.0%       0.00        vmstat.cpu.wa
     10.67 ± 97%     -34.4%       7.00           -34.4%       7.00        vmstat.io.bi
      8.00 ± 70%     -25.0%       6.00           -25.0%       6.00        vmstat.io.bo
      1046            -0.1%       1045            -0.1%       1045        vmstat.memory.buff
   2964204            -0.1%    2962572            -0.1%    2961826        vmstat.memory.cache
  63650311            +0.1%   63687273            +0.1%   63731617        vmstat.memory.free
      0.00          -100.0%       0.00          -100.0%       0.00        vmstat.procs.b
     92.00            -0.2%      91.78            -0.3%      91.70        vmstat.procs.r
      2022 ±  3%      +3.6%       2095            -1.3%       1995        vmstat.system.cs
    539357 ±  2%     +32.9%     716886 ±  4%      -2.1%     528047 ±  5%  vmstat.system.in
    143480 ±  3%     -12.0%     126262 ±  4%      -0.6%     142665 ±  3%  sched_debug.cfs_rq:/.min_vruntime.stddev
    548123 ±  7%     -20.7%     434543 ±  9%      -5.5%     517900 ±  7%  sched_debug.cfs_rq:/.spread0.avg
    655329 ±  6%     -16.2%     549218 ±  6%      -4.7%     624275 ±  5%  sched_debug.cfs_rq:/.spread0.max
    143388 ±  3%     -11.9%     126295 ±  4%      -0.6%     142588 ±  3%  sched_debug.cfs_rq:/.spread0.stddev
    240478 ±  6%     -12.0%     211715 ±  5%      -3.2%     232667 ±  8%  sched_debug.cpu.avg_idle.avg
      1938 ±  5%     +11.4%       2160 ±  3%      -2.1%       1897 ±  4%  sched_debug.cpu.nr_switches.min
  39960890 ±  6%     +54.7%   61837739 ±  4%      +5.0%   41939453 ± 11%  proc-vmstat.numa_hint_faults
  19987976 ±  6%     +55.1%   30996483 ±  4%      +5.0%   20978472 ± 11%  proc-vmstat.numa_hint_faults_local
  28840932 ±  3%      +8.0%   31160418 ±  4%      -0.3%   28764186 ±  4%  proc-vmstat.numa_hit
  28753783 ±  3%      +8.1%   31074486 ±  4%      -0.3%   28675501 ±  4%  proc-vmstat.numa_local
  19745743 ±  5%     +11.8%   22080123 ±  6%      -0.4%   19668879 ±  6%  proc-vmstat.numa_pages_migrated
  40107839 ±  6%     +54.6%   61988683 ±  4%      +5.0%   42094380 ± 11%  proc-vmstat.numa_pte_updates
  37158989 ±  2%      +6.3%   39482935 ±  3%      -0.2%   37080293 ±  3%  proc-vmstat.pgalloc_normal
  68856116 ±  5%     +35.1%   93057570 ±  3%      +2.8%   70755839 ±  8%  proc-vmstat.pgfault
  19745743 ±  5%     +11.8%   22080123 ±  6%      -0.4%   19668879 ±  6%  proc-vmstat.pgmigrate_success
  19754280 ±  5%     +11.8%   22080663 ±  6%      -0.4%   19677784 ±  6%  proc-vmstat.pgreuse
   8953845 ±  3%     +13.3%   10142474 ±  2%      +0.7%    9013008 ±  2%  perf-stat.i.branch-misses
    158.09            +7.5%     170.00 ±  2%      +1.5%     160.38 ±  3%  perf-stat.i.cpu-migrations
      9.10            -0.1        8.97            -0.0        9.08        perf-stat.i.dTLB-store-miss-rate%
   2454429 ±  2%     +26.7%    3110501 ±  5%      -5.2%    2326293 ±  3%  perf-stat.i.iTLB-load-misses
      0.31 ± 38%     -68.9%       0.10 ± 31%     -11.2%       0.27 ± 22%  perf-stat.i.major-faults
    224958 ±  5%     +35.4%     304571 ±  3%      +2.7%     231063 ±  8%  perf-stat.i.minor-faults
    224959 ±  5%     +35.4%     304571 ±  3%      +2.7%     231064 ±  8%  perf-stat.i.page-faults
      0.08 ±  4%      +0.0        0.09 ±  3%      +0.0        0.08 ±  2%  perf-stat.overall.branch-miss-rate%
      9.38            -0.1        9.25            -0.0        9.37        perf-stat.overall.dTLB-store-miss-rate%
     95.49            +1.0       96.53            -0.3       95.15        perf-stat.overall.iTLB-load-miss-rate%
     20490 ±  3%     -21.5%      16077 ±  6%      +4.5%      21404 ±  4%  perf-stat.overall.instructions-per-iTLB-miss
   8906114 ±  3%     +13.3%   10090374 ±  2%      +0.7%    8968593 ±  2%  perf-stat.ps.branch-misses
    157.57            +7.6%     169.49 ±  2%      +1.4%     159.76 ±  3%  perf-stat.ps.cpu-migrations
   2444301 ±  2%     +26.8%    3098710 ±  5%      -5.2%    2317560 ±  3%  perf-stat.ps.iTLB-load-misses
      0.31 ± 38%     -68.8%       0.10 ± 31%     -10.8%       0.27 ± 22%  perf-stat.ps.major-faults
    224444 ±  5%     +35.3%     303619 ±  3%      +2.7%     230589 ±  8%  perf-stat.ps.minor-faults
    224444 ±  5%     +35.3%     303620 ±  3%      +2.7%     230589 ±  8%  perf-stat.ps.page-faults
      1.26 ± 15%      -1.3        0.00            -0.0        1.25 ± 14%  perf-profile.calltrace.cycles-pp.migrate_folio_unmap.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_numa_page
      1.14 ± 15%      -1.1        0.00            -0.0        1.12 ± 14%  perf-profile.calltrace.cycles-pp.try_to_migrate.migrate_folio_unmap.migrate_pages_batch.migrate_pages.migrate_misplaced_page
      1.12 ± 15%      -1.1        0.00            -0.0        1.11 ± 14%  perf-profile.calltrace.cycles-pp.rmap_walk_anon.try_to_migrate.migrate_folio_unmap.migrate_pages_batch.migrate_pages
      1.08 ± 15%      -1.1        0.00            -0.0        1.06 ± 14%  perf-profile.calltrace.cycles-pp.try_to_migrate_one.rmap_walk_anon.try_to_migrate.migrate_folio_unmap.migrate_pages_batch
      0.92 ± 15%      -0.9        0.00            -0.0        0.92 ± 14%  perf-profile.calltrace.cycles-pp.ptep_clear_flush.try_to_migrate_one.rmap_walk_anon.try_to_migrate.migrate_folio_unmap
      0.91 ± 15%      -0.9        0.00            -0.0        0.91 ± 14%  perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.ptep_clear_flush.try_to_migrate_one.rmap_walk_anon.try_to_migrate
      0.91 ± 15%      -0.9        0.00            -0.0        0.91 ± 14%  perf-profile.calltrace.cycles-pp.on_each_cpu_cond_mask.flush_tlb_mm_range.ptep_clear_flush.try_to_migrate_one.rmap_walk_anon
      0.91 ± 15%      -0.9        0.00            -0.0        0.90 ± 14%  perf-profile.calltrace.cycles-pp.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.ptep_clear_flush.try_to_migrate_one
     72.48 ±  3%      -0.7       71.79            +2.8       75.24 ±  5%  perf-profile.calltrace.cycles-pp.do_access
      0.26 ±112%      -0.3        0.00            +0.1        0.34 ± 82%  perf-profile.calltrace.cycles-pp._raw_spin_lock.do_numa_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
      0.19 ±141%      -0.2        0.00            -0.0        0.16 ±153%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.do_numa_page.__handle_mm_fault.handle_mm_fault
      0.07 ±282%      -0.1        0.00            -0.1        0.00        perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.nrand48_r
      0.07 ±282%      -0.1        0.00            -0.1        0.00        perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.nrand48_r
      0.07 ±282%      -0.1        0.00            -0.1        0.00        perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.nrand48_r
      0.06 ±282%      -0.1        0.00            -0.1        0.00        perf-profile.calltrace.cycles-pp.rmap_walk_anon.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_numa_page
      0.13 ±188%      -0.0        0.11 ±187%      -0.1        0.00        perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.nrand48_r
      4.13 ±  3%      -0.0        4.12            -0.1        3.98 ±  6%  perf-profile.calltrace.cycles-pp.do_rw_once
      1.34 ± 39%      +0.0        1.35 ± 25%      -0.2        1.16 ± 22%  perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
      0.55 ± 69%      +0.0        0.60 ± 56%      -0.1        0.50 ± 52%  perf-profile.calltrace.cycles-pp.scheduler_tick.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues
      1.09 ± 31%      +0.1        1.14 ± 26%      -0.2        0.93 ± 37%  perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt
      1.08 ± 31%      +0.1        1.13 ± 26%      -0.2        0.92 ± 37%  perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt
      0.00            +0.1        0.06 ±282%      +0.0        0.00        perf-profile.calltrace.cycles-pp.move_to_new_folio.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_numa_page
      0.00            +0.1        0.06 ±282%      +0.0        0.00        perf-profile.calltrace.cycles-pp.migrate_folio_extra.move_to_new_folio.migrate_pages_batch.migrate_pages.migrate_misplaced_page
      1.18 ± 30%      +0.1        1.24 ± 26%      -0.1        1.07 ± 23%  perf-profile.calltrace.cycles-pp.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt
      1.52 ± 28%      +0.1        1.58 ± 25%      -0.2        1.36 ± 21%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.do_access
      1.43 ± 29%      +0.1        1.50 ± 25%      -0.1        1.29 ± 21%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.do_access
      1.44 ± 28%      +0.1        1.51 ± 25%      -0.1        1.30 ± 21%  perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.do_access
      1.72 ± 25%      +0.1        1.80 ± 22%      -0.2        1.55 ± 20%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.do_access
      6.40 ±  9%      +0.1        6.54            -0.6        5.76 ± 17%  perf-profile.calltrace.cycles-pp.lrand48_r
      0.17 ±196%      +0.2        0.33 ± 89%      -0.1        0.11 ±200%  perf-profile.calltrace.cycles-pp.task_tick_fair.scheduler_tick.update_process_times.tick_sched_handle.tick_sched_timer
      0.00            +0.3        0.26 ±113%      +0.0        0.00        perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.do_access
      0.00            +0.3        0.26 ±113%      +0.0        0.00        perf-profile.calltrace.cycles-pp.__sysvec_call_function.sysvec_call_function.asm_sysvec_call_function.do_access
      0.00            +0.3        0.33 ± 91%      +0.0        0.00        perf-profile.calltrace.cycles-pp.sysvec_call_function.asm_sysvec_call_function.do_access
     19.08 ± 10%      +0.5       19.59            -2.2       16.90 ± 19%  perf-profile.calltrace.cycles-pp.nrand48_r
      0.00            +0.6        0.59 ± 40%      +0.0        0.00        perf-profile.calltrace.cycles-pp.asm_sysvec_call_function.do_access
      3.30 ± 15%      +0.9        4.18 ± 19%      -0.1        3.24 ± 14%  perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
      3.34 ± 15%      +0.9        4.22 ± 19%      -0.1        3.27 ± 14%  perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access
      0.00            +0.9        0.90 ± 18%      +0.0        0.00        perf-profile.calltrace.cycles-pp.llist_add_batch.smp_call_function_many_cond.on_each_cpu_cond_mask.arch_tlbbatch_flush.try_to_unmap_flush
      3.70 ± 15%      +0.9        4.64 ± 19%      -0.1        3.60 ± 14%  perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.do_access
      3.68 ± 15%      +0.9        4.63 ± 19%      -0.1        3.59 ± 14%  perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access
      3.89 ± 14%      +1.0        4.85 ± 19%      -0.1        3.76 ± 14%  perf-profile.calltrace.cycles-pp.asm_exc_page_fault.do_access
      3.03 ± 15%      +1.0        4.03 ± 19%      -0.1        2.98 ± 14%  perf-profile.calltrace.cycles-pp.do_numa_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
      2.46 ± 15%      +1.4        3.85 ± 19%      -0.1        2.41 ± 14%  perf-profile.calltrace.cycles-pp.migrate_misplaced_page.do_numa_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
      2.27 ± 15%      +1.4        3.67 ± 19%      -0.0        2.22 ± 14%  perf-profile.calltrace.cycles-pp.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_numa_page.__handle_mm_fault
      2.27 ± 15%      +1.4        3.68 ± 19%      -0.0        2.23 ± 14%  perf-profile.calltrace.cycles-pp.migrate_pages.migrate_misplaced_page.do_numa_page.__handle_mm_fault.handle_mm_fault
      0.00            +2.4        2.38 ± 18%      +0.0        0.00        perf-profile.calltrace.cycles-pp.smp_call_function_many_cond.on_each_cpu_cond_mask.arch_tlbbatch_flush.try_to_unmap_flush.migrate_pages_batch
      0.00            +2.4        2.40 ± 18%      +0.0        0.00        perf-profile.calltrace.cycles-pp.on_each_cpu_cond_mask.arch_tlbbatch_flush.try_to_unmap_flush.migrate_pages_batch.migrate_pages
      0.00            +2.4        2.40 ± 18%      +0.0        0.00        perf-profile.calltrace.cycles-pp.try_to_unmap_flush.migrate_pages_batch.migrate_pages.migrate_misplaced_page.do_numa_page
      0.00            +2.4        2.40 ± 18%      +0.0        0.00        perf-profile.calltrace.cycles-pp.arch_tlbbatch_flush.try_to_unmap_flush.migrate_pages_batch.migrate_pages.migrate_misplaced_page
      1.51 ± 16%      -1.2        0.31 ± 20%      -0.0        1.48 ± 14%  perf-profile.children.cycles-pp.rmap_walk_anon
      1.25 ± 16%      -1.0        0.29 ± 20%      -0.0        1.22 ± 15%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
      1.08 ± 15%      -1.0        0.12 ± 21%      -0.0        1.06 ± 14%  perf-profile.children.cycles-pp.try_to_migrate_one
      1.14 ± 15%      -0.9        0.19 ± 19%      -0.0        1.12 ± 14%  perf-profile.children.cycles-pp.try_to_migrate
      0.92 ± 15%      -0.9        0.00            -0.0        0.92 ± 14%  perf-profile.children.cycles-pp.ptep_clear_flush
      1.26 ± 15%      -0.9        0.34 ± 21%      -0.0        1.25 ± 14%  perf-profile.children.cycles-pp.migrate_folio_unmap
      0.92 ± 15%      -0.9        0.00            -0.0        0.91 ± 14%  perf-profile.children.cycles-pp.flush_tlb_mm_range
      1.05 ± 15%      -0.9        0.16 ± 16%      -0.0        1.04 ± 15%  perf-profile.children.cycles-pp._raw_spin_lock
     72.83 ±  3%      -0.6       72.25            +2.8       75.59 ±  5%  perf-profile.children.cycles-pp.do_access
      0.46 ± 15%      -0.3        0.11 ± 20%      -0.0        0.44 ± 14%  perf-profile.children.cycles-pp.page_vma_mapped_walk
      0.34 ± 15%      -0.3        0.08 ± 18%      -0.0        0.33 ± 15%  perf-profile.children.cycles-pp.remove_migration_pte
      0.14 ± 16%      -0.1        0.00            -0.0        0.14 ± 17%  perf-profile.children.cycles-pp.handle_pte_fault
      0.13 ± 22%      -0.0        0.09 ± 23%      -0.0        0.12 ± 17%  perf-profile.children.cycles-pp.folio_lruvec_lock_irq
      0.13 ± 22%      -0.0        0.09 ± 22%      -0.0        0.12 ± 18%  perf-profile.children.cycles-pp._raw_spin_lock_irq
      0.09 ± 39%      -0.0        0.07 ± 75%      -0.0        0.09 ± 52%  perf-profile.children.cycles-pp.cpuacct_account_field
      0.17 ± 21%      -0.0        0.15 ± 21%      -0.0        0.16 ± 15%  perf-profile.children.cycles-pp.folio_isolate_lru
      0.19 ± 20%      -0.0        0.17 ± 20%      -0.0        0.18 ± 15%  perf-profile.children.cycles-pp.numamigrate_isolate_page
      0.12 ± 95%      -0.0        0.11 ± 16%      -0.1        0.06 ± 13%  perf-profile.children.cycles-pp.irqentry_exit_to_user_mode
      0.09 ± 47%      -0.0        0.08 ± 43%      -0.0        0.06 ± 38%  perf-profile.children.cycles-pp.hrtimer_active
      4.37 ±  3%      -0.0        4.36            -0.2        4.22 ±  5%  perf-profile.children.cycles-pp.do_rw_once
      0.33 ±  2%      -0.0        0.32 ±  2%      -0.0        0.32 ±  5%  perf-profile.children.cycles-pp.lrand48_r@plt
      0.01 ±282%      -0.0        0.00            -0.0        0.00        perf-profile.children.cycles-pp.enqueue_hrtimer
      0.01 ±282%      -0.0        0.00            -0.0        0.00        perf-profile.children.cycles-pp.timerqueue_add
      0.06 ± 13%      -0.0        0.05 ± 37%      -0.0        0.04 ± 51%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.06 ± 13%      -0.0        0.05 ± 37%      -0.0        0.04 ± 51%  perf-profile.children.cycles-pp.do_syscall_64
      0.01 ±282%      -0.0        0.00            -0.0        0.00        perf-profile.children.cycles-pp.lapic_next_deadline
      0.01 ±282%      -0.0        0.00            -0.0        0.00        perf-profile.children.cycles-pp.hrtimer_update_next_event
      0.01 ±282%      -0.0        0.00            -0.0        0.00        perf-profile.children.cycles-pp.update_min_vruntime
      0.01 ±282%      -0.0        0.00            -0.0        0.00        perf-profile.children.cycles-pp.rcu_core
      0.15 ± 20%      -0.0        0.15 ± 21%      -0.0        0.14 ± 17%  perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
      0.07 ± 27%      -0.0        0.06 ± 55%      -0.0        0.05 ± 53%  perf-profile.children.cycles-pp.ktime_get
      0.01 ±193%      -0.0        0.01 ±188%      -0.0        0.01 ±201%  perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
      0.01 ±282%      -0.0        0.01 ±282%      -0.0        0.01 ±299%  perf-profile.children.cycles-pp.perf_rotate_context
      0.21 ± 17%      -0.0        0.21 ± 18%      -0.0        0.20 ± 15%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
      0.02 ±209%      -0.0        0.02 ±142%      -0.0        0.01 ±300%  perf-profile.children.cycles-pp.update_cfs_group
      0.05 ± 43%      -0.0        0.05 ± 57%      -0.0        0.04 ± 67%  perf-profile.children.cycles-pp.update_irq_load_avg
      0.00            +0.0        0.00            +0.0        0.02 ±300%  perf-profile.children.cycles-pp.secondary_startup_64_no_verify
      0.00            +0.0        0.00            +0.0        0.02 ±300%  perf-profile.children.cycles-pp.start_secondary
      0.00            +0.0        0.00            +0.0        0.02 ±300%  perf-profile.children.cycles-pp.cpu_startup_entry
      0.00            +0.0        0.00            +0.0        0.02 ±300%  perf-profile.children.cycles-pp.do_idle
      0.00            +0.0        0.00            +0.0        0.02 ±300%  perf-profile.children.cycles-pp.cpuidle_idle_call
      0.00            +0.0        0.00            +0.0        0.02 ±300%  perf-profile.children.cycles-pp.cpuidle_enter
      0.00            +0.0        0.00            +0.0        0.02 ±300%  perf-profile.children.cycles-pp.cpuidle_enter_state
      0.00            +0.0        0.00            +0.0        0.02 ±300%  perf-profile.children.cycles-pp.mwait_idle_with_hints
      0.00            +0.0        0.00            +0.0        0.02 ±300%  perf-profile.children.cycles-pp.intel_idle
      0.06 ± 18%      +0.0        0.07 ± 41%      -0.0        0.05 ± 66%  perf-profile.children.cycles-pp.rcu_pending
      0.02 ±112%      +0.0        0.03 ±111%      -0.0        0.01 ±300%  perf-profile.children.cycles-pp.timerqueue_del
      0.02 ±111%      +0.0        0.03 ±112%      +0.0        0.03 ±100%  perf-profile.children.cycles-pp.irqtime_account_process_tick
      0.06 ± 18%      +0.0        0.06 ± 19%      -0.0        0.03 ± 82%  perf-profile.children.cycles-pp.mt_find
      0.07 ± 39%      +0.0        0.07 ± 28%      -0.0        0.05 ± 55%  perf-profile.children.cycles-pp.ktime_get_update_offsets_now
      0.00            +0.0        0.01 ±282%      +0.0        0.00        perf-profile.children.cycles-pp._find_next_bit
      0.00            +0.0        0.01 ±282%      +0.0        0.00        perf-profile.children.cycles-pp.folio_get_anon_vma
      0.00            +0.0        0.01 ±282%      +0.0        0.00        perf-profile.children.cycles-pp.__free_one_page
      0.06 ± 18%      +0.0        0.06 ± 20%      -0.0        0.03 ± 82%  perf-profile.children.cycles-pp.find_vma
      0.11 ± 25%      +0.0        0.11 ± 25%      -0.0        0.09 ± 38%  perf-profile.children.cycles-pp.update_rq_clock
      0.32 ± 19%      +0.0        0.33 ± 32%      -0.0        0.30 ± 31%  perf-profile.children.cycles-pp.account_user_time
      0.21 ± 48%      +0.0        0.22 ± 28%      -0.0        0.18 ± 23%  perf-profile.children.cycles-pp.update_load_avg
      0.09 ± 20%      +0.0        0.09 ± 23%      -0.0        0.08 ± 38%  perf-profile.children.cycles-pp.tick_sched_do_timer
      0.02 ±154%      +0.0        0.03 ± 92%      -0.0        0.02 ±155%  perf-profile.children.cycles-pp.__do_softirq
      0.07 ± 35%      +0.0        0.08 ± 26%      -0.0        0.07 ± 20%  perf-profile.children.cycles-pp.clockevents_program_event
      0.08 ± 36%      +0.0        0.09 ± 24%      -0.0        0.07 ± 19%  perf-profile.children.cycles-pp.__irq_exit_rcu
      0.03 ±127%      +0.0        0.04 ± 72%      -0.0        0.02 ±123%  perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
      0.08 ± 18%      +0.0        0.09 ± 26%      -0.0        0.06 ± 53%  perf-profile.children.cycles-pp.rcu_sched_clock_irq
      0.00            +0.0        0.01 ±187%      +0.0        0.00        perf-profile.children.cycles-pp.lru_add_fn
      0.21 ± 19%      +0.0        0.22 ± 21%      -0.0        0.20 ± 15%  perf-profile.children.cycles-pp.folio_batch_move_lru
      0.21 ± 19%      +0.0        0.22 ± 20%      -0.0        0.20 ± 15%  perf-profile.children.cycles-pp.lru_add_drain
      0.21 ± 19%      +0.0        0.22 ± 20%      -0.0        0.20 ± 15%  perf-profile.children.cycles-pp.lru_add_drain_cpu
      0.06 ± 39%      +0.0        0.07 ± 21%      +0.0        0.06 ± 15%  perf-profile.children.cycles-pp.rmqueue_bulk
      0.06 ± 16%      +0.0        0.08 ± 21%      -0.0        0.06 ± 13%  perf-profile.children.cycles-pp.free_unref_page
      0.09 ± 16%      +0.0        0.11 ± 22%      -0.0        0.09 ± 14%  perf-profile.children.cycles-pp.__alloc_pages
      0.09 ± 15%      +0.0        0.11 ± 21%      -0.0        0.09 ± 17%  perf-profile.children.cycles-pp.rmqueue
      0.09 ± 16%      +0.0        0.11 ± 21%      -0.0        0.09 ± 14%  perf-profile.children.cycles-pp.get_page_from_freelist
      0.03 ± 71%      +0.0        0.05 ± 39%      -0.0        0.03 ± 82%  perf-profile.children.cycles-pp.free_pcppages_bulk
      0.00            +0.0        0.02 ±142%      +0.0        0.00        perf-profile.children.cycles-pp.can_change_pte_writable
      0.00            +0.0        0.02 ±142%      +0.0        0.00        perf-profile.children.cycles-pp.folio_migrate_flags
      0.03 ±152%      +0.0        0.04 ± 72%      +0.0        0.03 ± 84%  perf-profile.children.cycles-pp.__update_load_avg_se
      0.09 ± 18%      +0.0        0.11 ± 22%      -0.0        0.09 ± 14%  perf-profile.children.cycles-pp.__folio_alloc
      0.09 ± 18%      +0.0        0.11 ± 22%      +0.0        0.09 ± 16%  perf-profile.children.cycles-pp.alloc_misplaced_dst_page
      0.08 ± 15%      +0.0        0.10 ± 21%      -0.0        0.08 ± 16%  perf-profile.children.cycles-pp.__list_del_entry_valid
      0.04 ± 91%      +0.0        0.06 ± 38%      +0.0        0.04 ± 66%  perf-profile.children.cycles-pp.arch_scale_freq_tick
      0.11 ± 16%      +0.0        0.13 ± 29%      -0.0        0.10 ± 28%  perf-profile.children.cycles-pp.__cgroup_account_cputime_field
      0.19 ± 17%      +0.0        0.21 ± 18%      -0.0        0.18 ± 18%  perf-profile.children.cycles-pp.down_read_trylock
      0.03 ±118%      +0.0        0.05 ± 59%      -0.0        0.03 ±101%  perf-profile.children.cycles-pp.perf_trace_sched_stat_runtime
      0.02 ±142%      +0.0        0.04 ± 72%      -0.0        0.01 ±299%  perf-profile.children.cycles-pp.irqtime_account_irq
      0.25 ± 39%      +0.0        0.27 ± 25%      -0.0        0.22 ± 22%  perf-profile.children.cycles-pp.update_curr
      0.09 ±  7%      +0.0        0.11 ± 14%      -0.0        0.08 ± 15%  perf-profile.children.cycles-pp.sync_regs
      0.16 ± 13%      +0.0        0.18 ± 19%      -0.0        0.15 ± 14%  perf-profile.children.cycles-pp.up_read
      0.68 ± 45%      +0.0        0.71 ± 28%      -0.1        0.58 ± 24%  perf-profile.children.cycles-pp.task_tick_fair
      0.02 ±141%      +0.0        0.05 ± 42%      +0.0        0.02 ±122%  perf-profile.children.cycles-pp.uncharge_batch
      0.01 ±282%      +0.0        0.04 ± 75%      +0.0        0.01 ±200%  perf-profile.children.cycles-pp.page_counter_uncharge
      0.02 ±141%      +0.0        0.06 ± 44%      +0.0        0.02 ±100%  perf-profile.children.cycles-pp.__mem_cgroup_uncharge
      0.02 ±141%      +0.0        0.06 ± 44%      +0.0        0.02 ±100%  perf-profile.children.cycles-pp.__folio_put
      0.03 ± 71%      +0.0        0.08 ± 25%      -0.0        0.03 ± 82%  perf-profile.children.cycles-pp.mem_cgroup_migrate
      0.96 ± 40%      +0.0        1.00 ± 27%      -0.1        0.81 ± 24%  perf-profile.children.cycles-pp.scheduler_tick
      0.11 ± 20%      +0.0        0.16 ± 15%      -0.0        0.11 ± 11%  perf-profile.children.cycles-pp.native_irq_return_iret
      0.06 ± 13%      +0.1        0.11 ± 16%      -0.0        0.06 ± 13%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
      0.05 ± 36%      +0.1        0.10 ± 18%      -0.0        0.04 ± 51%  perf-profile.children.cycles-pp.exit_to_user_mode_loop
      0.01 ±187%      +0.1        0.07 ± 26%      +0.0        0.02 ±122%  perf-profile.children.cycles-pp.page_counter_charge
      0.04 ± 71%      +0.1        0.10 ± 18%      +0.0        0.04 ± 51%  perf-profile.children.cycles-pp.task_work_run
      0.17 ± 14%      +0.1        0.23 ± 20%      -0.0        0.17 ± 15%  perf-profile.children.cycles-pp.copy_page
      0.17 ± 13%      +0.1        0.24 ± 19%      -0.0        0.17 ± 15%  perf-profile.children.cycles-pp.folio_copy
      0.03 ± 90%      +0.1        0.10 ± 16%      +0.0        0.04 ± 51%  perf-profile.children.cycles-pp.change_pte_range
      0.03 ± 90%      +0.1        0.10 ± 18%      +0.0        0.04 ± 51%  perf-profile.children.cycles-pp.task_numa_work
      0.03 ± 90%      +0.1        0.10 ± 18%      +0.0        0.04 ± 51%  perf-profile.children.cycles-pp.change_prot_numa
      0.03 ± 90%      +0.1        0.10 ± 18%      +0.0        0.04 ± 51%  perf-profile.children.cycles-pp.change_protection_range
      0.03 ± 90%      +0.1        0.10 ± 18%      +0.0        0.04 ± 51%  perf-profile.children.cycles-pp.change_pmd_range
      0.06 ± 40%      +0.1        0.13 ± 23%      +0.0        0.06 ± 15%  perf-profile.children.cycles-pp.__default_send_IPI_dest_field
      1.58 ± 32%      +0.1        1.65 ± 25%      -0.2        1.36 ± 25%  perf-profile.children.cycles-pp.tick_sched_handle
      1.56 ± 32%      +0.1        1.64 ± 25%      -0.2        1.35 ± 25%  perf-profile.children.cycles-pp.update_process_times
      1.85 ± 30%      +0.1        1.94 ± 25%      -0.2        1.61 ± 24%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      1.71 ± 31%      +0.1        1.79 ± 25%      -0.2        1.49 ± 25%  perf-profile.children.cycles-pp.tick_sched_timer
      0.08 ± 16%      +0.1        0.17 ± 21%      -0.0        0.08 ± 17%  perf-profile.children.cycles-pp.default_send_IPI_mask_sequence_phys
      2.09 ± 29%      +0.1        2.18 ± 24%      -0.3        1.81 ± 23%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      2.06 ± 29%      +0.1        2.16 ± 24%      -0.3        1.79 ± 23%  perf-profile.children.cycles-pp.hrtimer_interrupt
      2.19 ± 29%      +0.1        2.29 ± 24%      -0.3        1.89 ± 23%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      0.25 ± 12%      +0.1        0.36 ± 20%      -0.0        0.25 ± 14%  perf-profile.children.cycles-pp.move_to_new_folio
      0.25 ± 12%      +0.1        0.36 ± 20%      -0.0        0.25 ± 14%  perf-profile.children.cycles-pp.migrate_folio_extra
      0.00            +0.1        0.12 ± 22%      +0.0        0.00        perf-profile.children.cycles-pp.native_flush_tlb_local
      2.48 ± 26%      +0.1        2.60 ± 22%      -0.3        2.14 ± 22%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      9.27 ±  8%      +0.2        9.45            -0.9        8.41 ± 15%  perf-profile.children.cycles-pp.lrand48_r
      0.25 ± 14%      +0.3        0.55 ± 18%      -0.0        0.24 ± 15%  perf-profile.children.cycles-pp.llist_reverse_order
      0.09 ± 17%      +0.4        0.45 ± 21%      +0.0        0.09 ± 12%  perf-profile.children.cycles-pp.flush_tlb_func
     16.69 ± 10%      +0.5       17.16            -2.0       14.72 ± 19%  perf-profile.children.cycles-pp.nrand48_r
      0.40 ± 15%      +0.5        0.93 ± 18%      -0.0        0.39 ± 14%  perf-profile.children.cycles-pp.llist_add_batch
      0.41 ± 14%      +0.7        1.14 ± 19%      -0.0        0.41 ± 14%  perf-profile.children.cycles-pp.__sysvec_call_function
      0.41 ± 14%      +0.7        1.14 ± 19%      -0.0        0.41 ± 14%  perf-profile.children.cycles-pp.__flush_smp_call_function_queue
      0.43 ± 14%      +0.7        1.17 ± 19%      -0.0        0.42 ± 14%  perf-profile.children.cycles-pp.sysvec_call_function
      0.55 ± 12%      +0.9        1.40 ± 19%      -0.0        0.53 ± 15%  perf-profile.children.cycles-pp.asm_sysvec_call_function
      3.31 ± 15%      +0.9        4.19 ± 19%      -0.1        3.24 ± 14%  perf-profile.children.cycles-pp.__handle_mm_fault
      3.34 ± 15%      +0.9        4.23 ± 19%      -0.1        3.27 ± 14%  perf-profile.children.cycles-pp.handle_mm_fault
      3.70 ± 15%      +0.9        4.64 ± 19%      -0.1        3.60 ± 14%  perf-profile.children.cycles-pp.exc_page_fault
      3.70 ± 15%      +0.9        4.64 ± 19%      -0.1        3.60 ± 14%  perf-profile.children.cycles-pp.do_user_addr_fault
      3.91 ± 14%      +1.0        4.88 ± 19%      -0.1        3.78 ± 14%  perf-profile.children.cycles-pp.asm_exc_page_fault
      3.03 ± 15%      +1.0        4.03 ± 19%      -0.1        2.98 ± 14%  perf-profile.children.cycles-pp.do_numa_page
      2.46 ± 15%      +1.4        3.85 ± 19%      -0.1        2.41 ± 14%  perf-profile.children.cycles-pp.migrate_misplaced_page
      2.27 ± 15%      +1.4        3.67 ± 19%      -0.0        2.22 ± 14%  perf-profile.children.cycles-pp.migrate_pages_batch
      2.27 ± 15%      +1.4        3.68 ± 19%      -0.0        2.23 ± 14%  perf-profile.children.cycles-pp.migrate_pages
      0.91 ± 15%      +1.5        2.42 ± 18%      -0.0        0.91 ± 14%  perf-profile.children.cycles-pp.smp_call_function_many_cond
      0.91 ± 15%      +1.5        2.42 ± 18%      -0.0        0.91 ± 14%  perf-profile.children.cycles-pp.on_each_cpu_cond_mask
      0.00            +2.4        2.40 ± 18%      +0.0        0.00        perf-profile.children.cycles-pp.try_to_unmap_flush
      0.00            +2.4        2.40 ± 18%      +0.0        0.00        perf-profile.children.cycles-pp.arch_tlbbatch_flush
     66.95 ±  3%      -2.0       64.95            +3.1       70.02 ±  6%  perf-profile.self.cycles-pp.do_access
      1.14 ± 16%      -0.9        0.28 ± 21%      -0.0        1.12 ± 15%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      0.06 ±187%      -0.1        0.00            -0.1        0.00        perf-profile.self.cycles-pp.irqentry_exit_to_user_mode
      4.08 ±  3%      -0.0        4.03            -0.1        3.94 ±  5%  perf-profile.self.cycles-pp.do_rw_once
      0.09 ± 39%      -0.0        0.07 ± 75%      -0.0        0.09 ± 52%  perf-profile.self.cycles-pp.cpuacct_account_field
      0.06 ± 14%      -0.0        0.04 ± 72%      -0.0        0.03 ± 82%  perf-profile.self.cycles-pp.mt_find
      0.01 ±282%      -0.0        0.00            -0.0        0.00        perf-profile.self.cycles-pp.lapic_next_deadline
      0.01 ±188%      -0.0        0.01 ±282%      -0.0        0.01 ±200%  perf-profile.self.cycles-pp.rmap_walk_anon
      0.01 ±282%      -0.0        0.00            -0.0        0.00        perf-profile.self.cycles-pp.update_min_vruntime
      0.08 ± 47%      -0.0        0.07 ± 45%      -0.0        0.06 ± 38%  perf-profile.self.cycles-pp.hrtimer_active
      0.29 ±  4%      -0.0        0.28 ±  2%      -0.0        0.28 ±  6%  perf-profile.self.cycles-pp.lrand48_r@plt
      0.06 ± 49%      -0.0        0.06 ± 56%      -0.0        0.05 ± 52%  perf-profile.self.cycles-pp.scheduler_tick
      0.02 ±209%      -0.0        0.02 ±142%      -0.0        0.01 ±300%  perf-profile.self.cycles-pp.update_cfs_group
      0.05 ± 43%      -0.0        0.05 ± 57%      -0.0        0.04 ± 67%  perf-profile.self.cycles-pp.update_irq_load_avg
      0.11 ± 49%      -0.0        0.11 ± 29%      -0.0        0.09 ± 23%  perf-profile.self.cycles-pp.update_load_avg
      0.09 ± 41%      +0.0        0.09 ± 42%      -0.0        0.08 ± 24%  perf-profile.self.cycles-pp.task_tick_fair
      0.00            +0.0        0.00            +0.0        0.02 ±300%  perf-profile.self.cycles-pp.mwait_idle_with_hints
      0.12 ± 27%      +0.0        0.13 ± 36%      -0.0        0.10 ± 45%  perf-profile.self.cycles-pp.account_user_time
      0.11 ± 17%      +0.0        0.12 ± 20%      -0.0        0.10 ± 15%  perf-profile.self.cycles-pp.__handle_mm_fault
      0.02 ±111%      +0.0        0.03 ±112%      +0.0        0.03 ±100%  perf-profile.self.cycles-pp.irqtime_account_process_tick
      0.06 ± 55%      +0.0        0.06 ± 42%      -0.0        0.04 ± 84%  perf-profile.self.cycles-pp.ktime_get_update_offsets_now
      0.08 ± 17%      +0.0        0.08 ± 21%      -0.0        0.07 ± 15%  perf-profile.self.cycles-pp.page_vma_mapped_walk
      0.00            +0.0        0.01 ±282%      +0.0        0.00        perf-profile.self.cycles-pp.__free_one_page
      0.02 ±141%      +0.0        0.02 ±112%      -0.0        0.01 ±300%  perf-profile.self.cycles-pp.hrtimer_interrupt
      0.06 ± 42%      +0.0        0.07 ± 43%      -0.0        0.06 ± 37%  perf-profile.self.cycles-pp.update_process_times
      0.07 ± 16%      +0.0        0.08 ± 25%      -0.0        0.07 ± 38%  perf-profile.self.cycles-pp.tick_sched_do_timer
      0.00            +0.0        0.01 ±187%      +0.0        0.00        perf-profile.self.cycles-pp.can_change_pte_writable
      0.00            +0.0        0.01 ±187%      +0.0        0.00        perf-profile.self.cycles-pp.folio_migrate_flags
      0.00            +0.0        0.01 ±188%      +0.0        0.00        perf-profile.self.cycles-pp.try_to_migrate_one
      0.02 ±191%      +0.0        0.03 ± 90%      -0.0        0.01 ±200%  perf-profile.self.cycles-pp.__update_load_avg_se
      0.19 ± 16%      +0.0        0.20 ± 19%      -0.0        0.17 ± 17%  perf-profile.self.cycles-pp.down_read_trylock
      0.03 ±113%      +0.0        0.04 ± 71%      -0.0        0.03 ±100%  perf-profile.self.cycles-pp.update_rq_clock
      0.09 ± 14%      +0.0        0.10 ± 16%      -0.0        0.08 ± 17%  perf-profile.self.cycles-pp._raw_spin_lock
      0.03 ±151%      +0.0        0.04 ± 72%      -0.0        0.02 ±123%  perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
      0.01 ±282%      +0.0        0.02 ±112%      -0.0        0.00        perf-profile.self.cycles-pp._raw_spin_lock_irqsave
      0.01 ±282%      +0.0        0.02 ±112%      -0.0        0.00        perf-profile.self.cycles-pp.rcu_pending
      0.10 ± 16%      +0.0        0.12 ± 29%      -0.0        0.10 ± 26%  perf-profile.self.cycles-pp.__cgroup_account_cputime_field
      0.01 ±282%      +0.0        0.02 ±112%      -0.0        0.01 ±300%  perf-profile.self.cycles-pp.__hrtimer_run_queues
      0.16 ± 41%      +0.0        0.18 ± 25%      -0.0        0.14 ± 24%  perf-profile.self.cycles-pp.update_curr
      0.15 ± 14%      +0.0        0.17 ± 21%      -0.0        0.14 ± 15%  perf-profile.self.cycles-pp.up_read
      0.04 ± 94%      +0.0        0.05 ± 56%      -0.0        0.03 ±100%  perf-profile.self.cycles-pp.ktime_get
      0.07 ± 16%      +0.0        0.10 ± 21%      +0.0        0.08 ± 16%  perf-profile.self.cycles-pp.__list_del_entry_valid
      0.04 ± 91%      +0.0        0.06 ± 38%      +0.0        0.04 ± 66%  perf-profile.self.cycles-pp.arch_scale_freq_tick
      0.03 ±118%      +0.0        0.05 ± 59%      -0.0        0.03 ±101%  perf-profile.self.cycles-pp.perf_trace_sched_stat_runtime
      0.00            +0.0        0.03 ±113%      +0.0        0.00        perf-profile.self.cycles-pp.page_counter_uncharge
      0.09 ±  7%      +0.0        0.11 ± 14%      -0.0        0.08 ± 15%  perf-profile.self.cycles-pp.sync_regs
      0.02 ±111%      +0.0        0.06 ± 15%      -0.0        0.02 ±152%  perf-profile.self.cycles-pp.change_pte_range
      0.11 ± 20%      +0.0        0.15 ± 15%      -0.0        0.11 ± 11%  perf-profile.self.cycles-pp.native_irq_return_iret
      0.01 ±282%      +0.1        0.06 ± 43%      -0.0        0.01 ±299%  perf-profile.self.cycles-pp.page_counter_charge
      0.16 ± 15%      +0.1        0.22 ± 21%      -0.0        0.16 ± 16%  perf-profile.self.cycles-pp.copy_page
      0.06 ± 40%      +0.1        0.13 ± 23%      +0.0        0.06 ± 15%  perf-profile.self.cycles-pp.__default_send_IPI_dest_field
      0.07 ± 15%      +0.1        0.16 ± 18%      +0.0        0.07 ± 15%  perf-profile.self.cycles-pp.__flush_smp_call_function_queue
      0.00            +0.1        0.11 ± 19%      +0.0        0.00        perf-profile.self.cycles-pp.native_flush_tlb_local
      8.81 ±  9%      +0.1        8.94 ±  2%      -0.8        7.99 ± 16%  perf-profile.self.cycles-pp.lrand48_r
      0.06 ± 16%      +0.3        0.33 ± 21%      -0.0        0.06 ± 36%  perf-profile.self.cycles-pp.flush_tlb_func
      0.25 ± 14%      +0.3        0.55 ± 18%      -0.0        0.24 ± 15%  perf-profile.self.cycles-pp.llist_reverse_order
     13.38 ± 11%      +0.3       13.71            -1.7       11.73 ± 21%  perf-profile.self.cycles-pp.nrand48_r
      0.35 ± 15%      +0.4        0.76 ± 18%      -0.0        0.34 ± 13%  perf-profile.self.cycles-pp.llist_add_batch
      0.37 ± 17%      +0.7        1.10 ± 18%      +0.0        0.38 ± 15%  perf-profile.self.cycles-pp.smp_call_function_many_cond

--
Best Regards,
Yujie


> Best Regards,
> Huang, Ying
> 
> ---------------------------8<-----------------------------------------
> From b36b662c80652447d7374faff1142a941dc9d617 Mon Sep 17 00:00:00 2001
> From: Huang Ying <ying.huang@xxxxxxxxx>
> Date: Mon, 20 Mar 2023 15:38:12 +0800
> Subject: [PATCH] dbg, migrate_pages: don't batch flushing for single page
>  migration
> 
> ---
>  mm/migrate.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 98f1c11197a8..7271209c1a03 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1113,8 +1113,8 @@ static void migrate_folio_done(struct folio *src,
>  static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page,
>                                unsigned long private, struct folio *src,
>                                struct folio **dstp, int force, bool avoid_force_lock,
> -                              enum migrate_mode mode, enum migrate_reason reason,
> -                              struct list_head *ret)
> +                              bool batch_flush, enum migrate_mode mode,
> +                              enum migrate_reason reason, struct list_head *ret)
>  {
>         struct folio *dst;
>         int rc = -EAGAIN;
> @@ -1253,7 +1253,7 @@ static int migrate_folio_unmap(new_page_t get_new_page, free_page_t put_new_page
>                 /* Establish migration ptes */
>                 VM_BUG_ON_FOLIO(folio_test_anon(src) &&
>                                !folio_test_ksm(src) && !anon_vma, src);
> -               try_to_migrate(src, TTU_BATCH_FLUSH);
> +               try_to_migrate(src, batch_flush ? TTU_BATCH_FLUSH : 0);
>                 page_was_mapped = 1;
>         }
>  
> @@ -1641,6 +1641,7 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>         bool nosplit = (reason == MR_NUMA_MISPLACED);
>         bool no_split_folio_counting = false;
>         bool avoid_force_lock;
> +       bool batch_flush = !list_is_singular(from);
>  
>  retry:
>         rc_saved = 0;
> @@ -1690,7 +1691,7 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>  
>                         rc = migrate_folio_unmap(get_new_page, put_new_page, private,
>                                                  folio, &dst, pass > 2, avoid_force_lock,
> -                                                mode, reason, ret_folios);
> +                                                batch_flush, mode, reason, ret_folios);
>                         /*
>                          * The rules are:
>                          *      Success: folio will be freed
> @@ -1804,7 +1805,8 @@ static int migrate_pages_batch(struct list_head *from, new_page_t get_new_page,
>         stats->nr_failed_pages += nr_retry_pages;
>  move:
>         /* Flush TLBs for all unmapped folios */
> -       try_to_unmap_flush();
> +       if (batch_flush)
> +               try_to_unmap_flush();
>  
>         retry = 1;
>         for (pass = 0;
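
For readers following the debug patch above, here is a minimal user-space sketch of the decision it introduces: batch the TLB flush only when the migration list holds something other than exactly one folio. The `struct list_head` and list helpers below are simplified stand-ins for the kernel's list API, and `want_batch_flush()` is a hypothetical name used only for illustration; the patch itself computes `!list_is_singular(from)` inline in `migrate_pages_batch()`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's circular doubly linked list head. */
struct list_head {
	struct list_head *next, *prev;
};

/* An empty list is a head that points back at itself. */
static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Link entry in just before head, i.e. at the tail of the list. */
static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* True when the list is non-empty and holds exactly one entry. */
static bool list_is_singular(const struct list_head *head)
{
	return head->next != head && head->next == head->prev;
}

/*
 * Hypothetical helper (not in the patch): mirrors
 * "bool batch_flush = !list_is_singular(from);".
 * A single-entry list flushes immediately; anything else batches.
 */
static bool want_batch_flush(const struct list_head *from)
{
	return !list_is_singular(from);
}
```

With one folio on the list, as in the NUMA-misplaced single-page path that appears in the profile above, `want_batch_flush()` returns false, so `try_to_migrate()` would be called without `TTU_BATCH_FLUSH` and the later `try_to_unmap_flush()` would be skipped.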




