[linus:master] [mm] 4ed4379881: will-it-scale.per_thread_ops 122.2% improvement

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




Hello,

kernel test robot noticed a 122.2% improvement of will-it-scale.per_thread_ops on:


commit: 4ed4379881aa62588aba6442a9f362a8cf7624e6 ("mm: handle shared faults under the VMA lock")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: will-it-scale
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:

	nr_task: 16
	mode: thread
	test: page_fault3
	cpufreq_governor: performance


In addition to that, the commit also has significant impact on the following tests:

+------------------+-------------------------------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops 7.8% improvement                                   |
| test machine     | 104 threads 2 sockets (Skylake) with 192G memory                                                |
| test parameters  | cpufreq_governor=performance                                                                    |
|                  | mode=process                                                                                    |
|                  | nr_task=16                                                                                      |
|                  | test=page_fault3                                                                                |
+------------------+-------------------------------------------------------------------------------------------------+



Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231115/202311151610.43a1e565-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/thread/16/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/page_fault3/will-it-scale

commit: 
  164b06f238 ("mm: call wp_page_copy() under the VMA lock")
  4ed4379881 ("mm: handle shared faults under the VMA lock")

164b06f238b98631 4ed4379881aa62588aba6442a9f 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    150146 ± 13%     +19.7%     179652 ±  7%  numa-meminfo.node0.Slab
     32317 ±  5%      -5.6%      30493        uptime.idle
     11.42 ±  2%      -1.2       10.22 ±  2%  mpstat.cpu.all.sys%
      1.86 ±  3%      +2.3        4.14        mpstat.cpu.all.usr%
   2648761           +89.3%    5013662        numa-numastat.node0.local_node
   2696913 ±  2%     +88.3%    5077488        numa-numastat.node0.numa_hit
   2696747 ±  2%     +88.3%    5077459        numa-vmstat.node0.numa_hit
   2648596           +89.3%    5013633        numa-vmstat.node0.numa_local
    107.33 ± 14%     +72.8%     185.50 ± 11%  perf-c2c.DRAM.local
      2752 ± 12%     -39.0%       1678 ± 15%  perf-c2c.HITM.local
      6748 ±  2%     -59.5%       2730        vmstat.system.cs
    175383           +76.1%     308919        vmstat.system.in
   3301453          +122.2%    7336336        will-it-scale.16.threads
     84.29            -1.1%      83.34        will-it-scale.16.threads_idle
    206340          +122.2%     458520        will-it-scale.per_thread_ops
   3301453          +122.2%    7336336        will-it-scale.workload
    263502 ±  2%      +4.7%     275788        proc-vmstat.nr_mapped
   3322833           +72.3%    5723978        proc-vmstat.numa_hit
   3215175           +74.7%    5616311        proc-vmstat.numa_local
   3408340           +70.6%    5814446        proc-vmstat.pgalloc_normal
 9.943e+08          +122.1%  2.208e+09        proc-vmstat.pgfault
   3359696           +71.6%    5765300        proc-vmstat.pgfree
   1102135 ±  9%     +19.8%    1320904 ±  6%  sched_debug.cfs_rq:/.avg_vruntime.max
   1102135 ±  9%     +19.8%    1320904 ±  6%  sched_debug.cfs_rq:/.min_vruntime.max
    862.58 ±  3%     +13.3%     976.92 ±  6%  sched_debug.cfs_rq:/.runnable_avg.max
    862.44 ±  3%     +13.2%     976.50 ±  6%  sched_debug.cfs_rq:/.util_avg.max
    286.93 ±  6%     +12.1%     321.76 ±  4%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
    202058 ±  8%     -35.9%     129538 ±  3%  sched_debug.cpu.avg_idle.stddev
     11549 ±  6%     -49.0%       5886        sched_debug.cpu.nr_switches.avg
     13882 ±  9%     -36.2%       8854 ± 11%  sched_debug.cpu.nr_switches.stddev
    450048 ±  4%     -92.1%      35653 ±  4%  turbostat.C1
      0.40 ±  5%      -0.4        0.02 ± 28%  turbostat.C1%
    986845           -75.4%     242529 ±  6%  turbostat.C1E
      1.07 ±  4%      -0.8        0.30 ± 12%  turbostat.C1E%
      0.08 ±  5%     +62.0%       0.14 ±  3%  turbostat.IPC
  76389757          +106.4%  1.577e+08        turbostat.IRQ
    218.21            +6.4%     232.07        turbostat.PkgWatt
     20.26            +2.6%      20.79        turbostat.RAMWatt
      0.01 ±145%     -94.2%       0.00 ±111%  perf-sched.sch_delay.avg.ms.__cond_resched.shmem_get_folio_gfp.shmem_fault.__do_fault.do_fault
      0.00 ±  9%     -78.3%       0.00 ± 82%  perf-sched.sch_delay.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
      0.00 ±  9%     -73.9%       0.00 ± 57%  perf-sched.sch_delay.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
      0.00 ± 14%     -87.5%       0.00 ± 99%  perf-sched.sch_delay.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi
      0.02 ±164%    -100.0%       0.00        perf-sched.sch_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      0.01 ± 58%    -100.0%       0.00        perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_killable.lock_mm_and_find_vma
    167.10 ±223%    +200.2%     501.66 ± 99%  perf-sched.sch_delay.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
      0.03 ±160%    -100.0%       0.00        perf-sched.sch_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
     12.57 ± 58%    -100.0%       0.00        perf-sched.sch_delay.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_killable.lock_mm_and_find_vma
      0.01 ± 23%     -54.1%       0.01 ± 67%  perf-sched.sch_delay.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff
     47.80 ±  2%    +123.9%     107.01 ±  3%  perf-sched.total_wait_and_delay.average.ms
     18466 ±  2%     -55.7%       8181 ±  3%  perf-sched.total_wait_and_delay.count.ms
     47.77 ±  2%    +123.6%     106.82 ±  3%  perf-sched.total_wait_time.average.ms
      2.79 ± 10%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      0.69 ±  9%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_killable.lock_mm_and_find_vma
      0.82 ± 10%    +125.0%       1.85 ± 14%  perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__vm_munmap
      0.12 ± 33%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff
    143.67 ±  5%    -100.0%       0.00        perf-sched.wait_and_delay.count.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
     11285 ±  3%    -100.0%       0.00        perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_killable.lock_mm_and_find_vma
    971.50 ±  2%    +117.6%       2114 ±  3%  perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__vm_munmap
    279.17 ±  4%    -100.0%       0.00        perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff
      4.85 ±  6%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
     13.27 ± 47%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_killable.lock_mm_and_find_vma
      5.25 ± 25%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff
     33.35 ± 14%     -85.5%       4.84 ±  7%  perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      2.77 ± 11%    -100.0%       0.00        perf-sched.wait_time.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      0.68 ±  9%    -100.0%       0.00        perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_killable.lock_mm_and_find_vma
      0.81 ±  9%    +127.0%       1.84 ± 14%  perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__vm_munmap
      4.84 ±  6%    -100.0%       0.00        perf-sched.wait_time.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      4.27          -100.0%       0.00        perf-sched.wait_time.max.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read_killable.lock_mm_and_find_vma
     32.32 ± 18%     -85.1%       4.83 ±  7%  perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
 2.627e+09 ±  3%     +73.7%  4.562e+09        perf-stat.i.branch-instructions
      1.17 ± 29%      -0.3        0.89 ± 20%  perf-stat.i.branch-miss-rate%
  23408547 ±  2%     +49.2%   34925457        perf-stat.i.branch-misses
      5.77 ± 22%      +3.2        9.00 ± 17%  perf-stat.i.cache-miss-rate%
   8222112 ±  2%     +64.8%   13546123        perf-stat.i.cache-misses
      6727 ±  2%     -60.5%       2660        perf-stat.i.context-switches
      3.50 ±  5%     -38.4%       2.15 ±  5%  perf-stat.i.cpi
  4.18e+10 ±  2%      +7.6%  4.497e+10        perf-stat.i.cpu-cycles
      5076 ±  2%     -34.6%       3320        perf-stat.i.cycles-between-cache-misses
   7634922          +121.9%   16944094        perf-stat.i.dTLB-load-misses
 3.135e+09 ±  3%     +78.2%  5.587e+09        perf-stat.i.dTLB-loads
      5.14 ±  2%      +1.4        6.57        perf-stat.i.dTLB-store-miss-rate%
  94623040 ±  3%    +128.1%  2.158e+08        perf-stat.i.dTLB-store-misses
 1.713e+09 ±  3%     +77.4%  3.039e+09        perf-stat.i.dTLB-stores
     82.69            +4.3       87.04        perf-stat.i.iTLB-load-miss-rate%
   4833304 ±  3%    +121.4%   10702487 ±  3%  perf-stat.i.iTLB-load-misses
    996162           +57.1%    1564855        perf-stat.i.iTLB-loads
 1.251e+10 ±  3%     +73.2%  2.167e+10        perf-stat.i.instructions
      2571 ±  2%     -21.0%       2030 ±  3%  perf-stat.i.instructions-per-iTLB-miss
      0.30 ±  2%     +62.0%       0.48        perf-stat.i.ipc
      0.40 ±  2%      +7.6%       0.43        perf-stat.i.metric.GHz
      1068 ±  2%     -75.6%     260.15 ± 10%  perf-stat.i.metric.K/sec
     73.25 ±  3%     +77.9%     130.33        perf-stat.i.metric.M/sec
   3223417 ±  3%    +124.8%    7247446        perf-stat.i.minor-faults
     26.42 ±  7%     -10.4       16.06 ±  3%  perf-stat.i.node-load-miss-rate%
    282161 ±  5%     +98.6%     560500 ±  7%  perf-stat.i.node-loads
   3248470 ±  3%    +124.2%    7282664        perf-stat.i.node-stores
   3223417 ±  3%    +124.8%    7247446        perf-stat.i.page-faults
      0.66            -4.9%       0.62        perf-stat.overall.MPKI
      0.89            -0.1        0.77        perf-stat.overall.branch-miss-rate%
      5.72 ± 22%      +3.2        8.95 ± 17%  perf-stat.overall.cache-miss-rate%
      3.34           -37.9%       2.07        perf-stat.overall.cpi
      5085 ±  2%     -34.7%       3319        perf-stat.overall.cycles-between-cache-misses
      0.24            +0.1        0.30        perf-stat.overall.dTLB-load-miss-rate%
      5.24            +1.4        6.63        perf-stat.overall.dTLB-store-miss-rate%
     82.90            +4.3       87.23        perf-stat.overall.iTLB-load-miss-rate%
      2589           -21.7%       2027 ±  3%  perf-stat.overall.instructions-per-iTLB-miss
      0.30           +61.0%       0.48        perf-stat.overall.ipc
     25.70 ±  4%     -10.0       15.74 ±  7%  perf-stat.overall.node-load-miss-rate%
      0.63 ± 13%      -0.3        0.30 ±  7%  perf-stat.overall.node-store-miss-rate%
   1168062           -23.0%     899539        perf-stat.overall.path-length
 2.618e+09 ±  3%     +73.6%  4.547e+09        perf-stat.ps.branch-instructions
  23334434 ±  2%     +49.2%   34805568        perf-stat.ps.branch-misses
   8195263 ±  2%     +64.8%   13501705        perf-stat.ps.cache-misses
      6706 ±  2%     -60.5%       2651        perf-stat.ps.context-switches
 4.167e+10 ±  2%      +7.6%  4.482e+10        perf-stat.ps.cpu-cycles
   7610946          +121.9%   16890018        perf-stat.ps.dTLB-load-misses
 3.125e+09 ±  3%     +78.2%  5.569e+09        perf-stat.ps.dTLB-loads
  94332447 ±  3%    +128.1%  2.151e+08        perf-stat.ps.dTLB-store-misses
 1.707e+09 ±  3%     +77.4%  3.029e+09        perf-stat.ps.dTLB-stores
   4818281 ±  3%    +121.4%   10668153 ±  3%  perf-stat.ps.iTLB-load-misses
    993104           +57.1%    1559797        perf-stat.ps.iTLB-loads
 1.248e+10 ±  3%     +73.2%   2.16e+10        perf-stat.ps.instructions
   3213492 ±  3%    +124.8%    7224439        perf-stat.ps.minor-faults
    281299 ±  5%     +98.6%     558721 ±  7%  perf-stat.ps.node-loads
   3238382 ±  3%    +124.2%    7259432        perf-stat.ps.node-stores
   3213492 ±  3%    +124.8%    7224439        perf-stat.ps.page-faults
 3.856e+12           +71.1%  6.599e+12        perf-stat.total.instructions
     46.83 ±  2%     -24.6       22.23 ±  3%  perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
     47.07 ±  2%     -24.4       22.68 ±  3%  perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.testcase
     16.08 ±  6%     -16.1        0.00        perf-profile.calltrace.cycles-pp.lock_mm_and_find_vma.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
     14.24 ±  2%     -14.2        0.00        perf-profile.calltrace.cycles-pp.down_read_trylock.lock_mm_and_find_vma.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
     62.50            -7.9       54.60 ±  2%  perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
     63.71            -6.9       56.82 ±  2%  perf-profile.calltrace.cycles-pp.testcase
      6.62            -6.6        0.00        perf-profile.calltrace.cycles-pp.up_read.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
     18.65            -6.2       12.49 ±  2%  perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
     20.30            -5.0       15.35 ±  3%  perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
      1.34 ±  6%      -0.2        1.17 ±  4%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      1.24 ±  6%      -0.2        1.07 ±  5%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
      0.84 ±  7%      -0.1        0.72 ±  7%  perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
      0.82 ±  7%      -0.1        0.71 ±  7%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
      0.66 ±  7%      -0.1        0.59 ±  6%  perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
      0.00            +0.5        0.53 ±  2%  perf-profile.calltrace.cycles-pp.xas_descend.xas_load.filemap_get_entry.shmem_get_folio_gfp.shmem_fault
      0.00            +0.6        0.56 ±  2%  perf-profile.calltrace.cycles-pp.__mod_lruvec_page_state.page_remove_rmap.tlb_flush_rmaps.zap_pte_range.zap_pmd_range
      0.00            +0.6        0.56 ±  7%  perf-profile.calltrace.cycles-pp.inode_needs_update_time.file_update_time.fault_dirty_shared_page.do_fault.__handle_mm_fault
      0.58 ±  4%      +0.6        1.22 ±  2%  perf-profile.calltrace.cycles-pp.page_remove_rmap.tlb_flush_rmaps.zap_pte_range.zap_pmd_range.unmap_page_range
      0.00            +0.6        0.64 ±  5%  perf-profile.calltrace.cycles-pp.mtree_range_walk.mas_walk.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault
      0.35 ± 70%      +0.6        0.99 ±  3%  perf-profile.calltrace.cycles-pp.xas_load.filemap_get_entry.shmem_get_folio_gfp.shmem_fault.__do_fault
      0.00            +0.7        0.66 ±  5%  perf-profile.calltrace.cycles-pp.file_update_time.fault_dirty_shared_page.do_fault.__handle_mm_fault.handle_mm_fault
      0.00            +0.7        0.66 ± 13%  perf-profile.calltrace.cycles-pp.__mod_lruvec_page_state.folio_add_file_rmap_range.set_pte_range.finish_fault.do_fault
      0.64 ±  3%      +0.7        1.35 ±  3%  perf-profile.calltrace.cycles-pp.__perf_sw_event.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
      0.00            +0.7        0.70 ±  3%  perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
      0.67 ±  4%      +0.7        1.39 ±  2%  perf-profile.calltrace.cycles-pp.tlb_flush_rmaps.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
      0.35 ± 70%      +0.7        1.06 ±  4%  perf-profile.calltrace.cycles-pp.___perf_sw_event.__perf_sw_event.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
      0.36 ± 71%      +0.8        1.15 ±  6%  perf-profile.calltrace.cycles-pp.__perf_sw_event.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
      0.84 ±  4%      +0.8        1.67 ±  2%  perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_fault.__do_fault.do_fault
      0.19 ±141%      +0.9        1.06 ±  8%  perf-profile.calltrace.cycles-pp.folio_add_file_rmap_range.set_pte_range.finish_fault.do_fault.__handle_mm_fault
      0.00            +0.9        0.90 ±  3%  perf-profile.calltrace.cycles-pp.___perf_sw_event.__perf_sw_event.handle_mm_fault.do_user_addr_fault.exc_page_fault
      0.82 ± 11%      +1.0        1.80 ±  4%  perf-profile.calltrace.cycles-pp.fault_dirty_shared_page.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
      1.21 ±  2%      +1.2        2.41 ±  2%  perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_fault.__do_fault.do_fault.__handle_mm_fault
      1.32 ± 11%      +1.2        2.52 ±  6%  perf-profile.calltrace.cycles-pp.set_pte_range.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
      1.48 ±  2%      +1.4        2.88 ±  2%  perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault
      1.41 ±  2%      +1.5        2.87 ±  2%  perf-profile.calltrace.cycles-pp.sync_regs.asm_exc_page_fault.testcase
      1.56 ±  3%      +1.5        3.03 ±  2%  perf-profile.calltrace.cycles-pp.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
      2.65 ±  7%      +1.6        4.30 ±  8%  perf-profile.calltrace.cycles-pp.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
      1.79 ±  8%      +1.7        3.50 ±  4%  perf-profile.calltrace.cycles-pp.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
      2.11 ±  4%      +2.3        4.46 ±  3%  perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
      2.15 ±  4%      +2.4        4.54 ±  3%  perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap
      2.15 ±  4%      +2.4        4.55 ±  3%  perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
      2.15 ±  4%      +2.4        4.55 ±  3%  perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
      2.16 ±  5%      +2.4        4.56 ±  3%  perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
      2.16 ±  5%      +2.4        4.57 ±  3%  perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.16 ±  5%      +2.4        4.57 ±  3%  perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
      2.18 ±  5%      +2.4        4.58 ±  3%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
      2.18 ±  5%      +2.4        4.58 ±  3%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
      2.18 ±  5%      +2.4        4.58 ±  3%  perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
      2.18 ±  5%      +2.4        4.58 ±  3%  perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
      2.18 ±  5%      +2.4        4.58 ±  3%  perf-profile.calltrace.cycles-pp.__munmap
      3.17 ±  2%      +3.2        6.36 ±  2%  perf-profile.calltrace.cycles-pp.irqentry_exit_to_user_mode.asm_exc_page_fault.testcase
      3.62 ±  3%      +3.8        7.43 ±  2%  perf-profile.calltrace.cycles-pp.error_entry.testcase
      4.88 ±  3%      +3.9        8.80 ±  2%  perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
      3.70 ±  3%      +4.0        7.66 ±  2%  perf-profile.calltrace.cycles-pp.__irqentry_text_end.testcase
     46.92 ±  2%     -24.6       22.36 ±  3%  perf-profile.children.cycles-pp.do_user_addr_fault
     47.10 ±  2%     -24.4       22.72 ±  3%  perf-profile.children.cycles-pp.exc_page_fault
     16.11 ±  6%     -16.1        0.00        perf-profile.children.cycles-pp.lock_mm_and_find_vma
     14.41 ±  2%     -14.1        0.31 ±  4%  perf-profile.children.cycles-pp.down_read_trylock
     57.26           -13.7       43.60 ±  2%  perf-profile.children.cycles-pp.asm_exc_page_fault
      7.06            -6.8        0.30 ±  3%  perf-profile.children.cycles-pp.up_read
     18.72            -6.2       12.56 ±  2%  perf-profile.children.cycles-pp.__handle_mm_fault
     20.36            -4.9       15.45 ±  2%  perf-profile.children.cycles-pp.handle_mm_fault
     67.14            -3.0       64.11 ±  2%  perf-profile.children.cycles-pp.testcase
      1.58 ±  5%      -0.2        1.39 ±  4%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      1.43 ±  5%      -0.2        1.27 ±  4%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      0.98 ±  5%      -0.1        0.88 ±  5%  perf-profile.children.cycles-pp.hrtimer_interrupt
      1.00 ±  5%      -0.1        0.90 ±  5%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      0.09 ± 21%      -0.1        0.03 ±102%  perf-profile.children.cycles-pp.intel_idle
      0.20 ± 10%      -0.0        0.16 ±  9%  perf-profile.children.cycles-pp.__do_softirq
      0.15 ±  7%      -0.0        0.12 ±  6%  perf-profile.children.cycles-pp.access_error
      0.06 ± 13%      +0.0        0.09 ±  7%  perf-profile.children.cycles-pp.irqentry_enter
      0.59 ±  2%      +0.1        0.64 ±  4%  perf-profile.children.cycles-pp.mtree_range_walk
      0.04 ± 45%      +0.1        0.10 ±  6%  perf-profile.children.cycles-pp.perf_swevent_event
      0.00            +0.1        0.05 ±  8%  perf-profile.children.cycles-pp.__tlb_remove_page_size
      0.06 ± 14%      +0.1        0.12 ±  8%  perf-profile.children.cycles-pp.folio_mapping
      0.00            +0.1        0.06 ±  9%  perf-profile.children.cycles-pp.llist_add_batch
      0.00            +0.1        0.06 ±  9%  perf-profile.children.cycles-pp.pte_mkwrite
      0.00            +0.1        0.06 ± 13%  perf-profile.children.cycles-pp.restore_regs_and_return_to_kernel
      0.01 ±223%      +0.1        0.08 ±  6%  perf-profile.children.cycles-pp.vm_normal_page
      0.15 ± 10%      +0.1        0.21 ±  7%  perf-profile.children.cycles-pp.__pte_offset_map
      0.06 ±  7%      +0.1        0.13 ±  5%  perf-profile.children.cycles-pp.perf_exclude_event
      0.09 ± 12%      +0.1        0.16 ±  4%  perf-profile.children.cycles-pp.error_return
      0.00            +0.1        0.08 ±  8%  perf-profile.children.cycles-pp.__cond_resched
      0.09 ±  9%      +0.1        0.18 ±  6%  perf-profile.children.cycles-pp.free_swap_cache
      0.08 ±  4%      +0.1        0.17 ± 10%  perf-profile.children.cycles-pp.xas_start
      0.10 ± 26%      +0.1        0.19 ± 19%  perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
      0.09 ± 10%      +0.1        0.18 ± 12%  perf-profile.children.cycles-pp.__count_memcg_events
      0.09 ±  6%      +0.1        0.19 ±  3%  perf-profile.children.cycles-pp.timestamp_truncate
      0.10 ±  9%      +0.1        0.21 ±  4%  perf-profile.children.cycles-pp.free_pages_and_swap_cache
      0.03 ±100%      +0.1        0.14 ±  6%  perf-profile.children.cycles-pp.native_flush_tlb_local
      0.06 ± 19%      +0.1        0.17 ±  6%  perf-profile.children.cycles-pp.default_send_IPI_mask_sequence_phys
      0.08 ± 13%      +0.1        0.20 ± 16%  perf-profile.children.cycles-pp.cgroup_rstat_updated
      0.07 ± 10%      +0.1        0.19 ±  4%  perf-profile.children.cycles-pp.flush_tlb_func
      0.14 ± 11%      +0.2        0.29 ±  4%  perf-profile.children.cycles-pp.release_pages
      0.21 ±  2%      +0.2        0.37 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock
      0.15 ± 10%      +0.2        0.31 ±  3%  perf-profile.children.cycles-pp.folio_unlock
      0.11 ± 10%      +0.2        0.28 ±  5%  perf-profile.children.cycles-pp.__flush_smp_call_function_queue
      0.15 ±  5%      +0.2        0.32 ±  5%  perf-profile.children.cycles-pp._compound_head
      0.01 ±223%      +0.2        0.20 ±  5%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
      0.11 ±  9%      +0.2        0.30 ±  3%  perf-profile.children.cycles-pp.__sysvec_call_function
      0.19 ±  3%      +0.2        0.40 ±  3%  perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
      0.13 ± 11%      +0.2        0.34 ±  5%  perf-profile.children.cycles-pp.flush_tlb_mm_range
      0.12 ± 11%      +0.2        0.34 ±  5%  perf-profile.children.cycles-pp.smp_call_function_many_cond
      0.12 ± 11%      +0.2        0.34 ±  5%  perf-profile.children.cycles-pp.on_each_cpu_cond_mask
      0.22 ± 13%      +0.2        0.44 ±  3%  perf-profile.children.cycles-pp.folio_mark_dirty
      0.14 ±  9%      +0.2        0.38 ±  2%  perf-profile.children.cycles-pp.sysvec_call_function
      0.19 ± 19%      +0.2        0.44 ± 12%  perf-profile.children.cycles-pp.__mod_node_page_state
      0.25 ±  6%      +0.3        0.50 ±  4%  perf-profile.children.cycles-pp.tlb_batch_pages_flush
      0.29 ±  7%      +0.3        0.55 ±  2%  perf-profile.children.cycles-pp.xas_descend
      0.28 ± 13%      +0.3        0.57 ±  7%  perf-profile.children.cycles-pp.inode_needs_update_time
      0.26 ± 17%      +0.3        0.58 ± 10%  perf-profile.children.cycles-pp.__mod_lruvec_state
      0.34 ± 10%      +0.3        0.67 ±  5%  perf-profile.children.cycles-pp.file_update_time
      0.27 ± 10%      +0.3        0.60 ±  3%  perf-profile.children.cycles-pp.noop_dirty_folio
      0.36 ±  2%      +0.4        0.72 ±  3%  perf-profile.children.cycles-pp.__pte_offset_map_lock
      0.24 ±  6%      +0.4        0.66 ±  4%  perf-profile.children.cycles-pp.asm_sysvec_call_function
      0.53 ±  5%      +0.5        1.04 ±  3%  perf-profile.children.cycles-pp.xas_load
      0.47 ± 16%      +0.6        1.08 ±  8%  perf-profile.children.cycles-pp.folio_add_file_rmap_range
      0.59 ±  4%      +0.6        1.23 ±  2%  perf-profile.children.cycles-pp.page_remove_rmap
      0.60 ± 11%      +0.7        1.30 ±  6%  perf-profile.children.cycles-pp.__mod_lruvec_page_state
      0.67 ±  4%      +0.7        1.40 ±  2%  perf-profile.children.cycles-pp.tlb_flush_rmaps
      0.84 ±  3%      +0.8        1.68 ±  2%  perf-profile.children.cycles-pp.filemap_get_entry
      0.84 ± 10%      +1.0        1.84 ±  4%  perf-profile.children.cycles-pp.fault_dirty_shared_page
      0.94 ±  2%      +1.1        2.00 ±  3%  perf-profile.children.cycles-pp.___perf_sw_event
      1.22 ±  2%      +1.2        2.43 ±  2%  perf-profile.children.cycles-pp.shmem_get_folio_gfp
      1.33 ± 12%      +1.2        2.54 ±  6%  perf-profile.children.cycles-pp.set_pte_range
      1.18 ±  4%      +1.3        2.51 ±  2%  perf-profile.children.cycles-pp.__perf_sw_event
      1.48 ±  2%      +1.4        2.90 ±  2%  perf-profile.children.cycles-pp.shmem_fault
      1.56 ±  3%      +1.5        3.03 ±  2%  perf-profile.children.cycles-pp.__do_fault
      1.46 ±  2%      +1.5        2.98 ±  2%  perf-profile.children.cycles-pp.sync_regs
      2.66 ±  7%      +1.7        4.31 ±  8%  perf-profile.children.cycles-pp.lock_vma_under_rcu
      1.82 ±  8%      +1.7        3.56 ±  4%  perf-profile.children.cycles-pp.finish_fault
      1.98 ±  2%      +2.0        4.00 ±  2%  perf-profile.children.cycles-pp.native_irq_return_iret
      2.29 ±  5%      +2.4        4.68 ±  3%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      2.28 ±  5%      +2.4        4.68 ±  3%  perf-profile.children.cycles-pp.do_syscall_64
      2.15 ±  4%      +2.4        4.55 ±  3%  perf-profile.children.cycles-pp.unmap_vmas
      2.15 ±  4%      +2.4        4.55 ±  3%  perf-profile.children.cycles-pp.unmap_page_range
      2.15 ±  4%      +2.4        4.55 ±  3%  perf-profile.children.cycles-pp.zap_pmd_range
      2.15 ±  4%      +2.4        4.55 ±  3%  perf-profile.children.cycles-pp.zap_pte_range
      2.17 ±  4%      +2.4        4.57 ±  3%  perf-profile.children.cycles-pp.do_vmi_align_munmap
      2.17 ±  4%      +2.4        4.57 ±  3%  perf-profile.children.cycles-pp.do_vmi_munmap
      2.16 ±  5%      +2.4        4.56 ±  3%  perf-profile.children.cycles-pp.unmap_region
      2.18 ±  4%      +2.4        4.58 ±  3%  perf-profile.children.cycles-pp.__vm_munmap
      2.18 ±  5%      +2.4        4.58 ±  3%  perf-profile.children.cycles-pp.__x64_sys_munmap
      2.18 ±  5%      +2.4        4.58 ±  3%  perf-profile.children.cycles-pp.__munmap
      3.23 ±  2%      +3.3        6.52 ±  2%  perf-profile.children.cycles-pp.irqentry_exit_to_user_mode
      3.67 ±  3%      +3.8        7.51 ±  2%  perf-profile.children.cycles-pp.error_entry
      4.93 ±  3%      +4.0        8.89 ±  2%  perf-profile.children.cycles-pp.do_fault
      3.71 ±  3%      +4.0        7.66 ±  2%  perf-profile.children.cycles-pp.__irqentry_text_end
     14.34 ±  2%     -14.0        0.31 ±  4%  perf-profile.self.cycles-pp.down_read_trylock
     13.36 ±  3%     -10.0        3.32 ±  5%  perf-profile.self.cycles-pp.__handle_mm_fault
      7.01            -6.7        0.30 ±  3%  perf-profile.self.cycles-pp.up_read
      0.09 ± 21%      -0.1        0.03 ±102%  perf-profile.self.cycles-pp.intel_idle
      0.12 ±  5%      -0.0        0.09 ±  9%  perf-profile.self.cycles-pp.pte_offset_map_nolock
      0.14 ±  4%      -0.0        0.12 ±  7%  perf-profile.self.cycles-pp.access_error
      0.04 ± 44%      +0.0        0.08 ±  8%  perf-profile.self.cycles-pp.file_update_time
      0.06 ± 13%      +0.0        0.11 ±  6%  perf-profile.self.cycles-pp.__count_memcg_events
      0.59 ±  2%      +0.0        0.63 ±  4%  perf-profile.self.cycles-pp.mtree_range_walk
      0.03 ±100%      +0.1        0.08 ± 13%  perf-profile.self.cycles-pp.__do_fault
      0.04 ± 45%      +0.1        0.10 ±  9%  perf-profile.self.cycles-pp.perf_swevent_event
      0.00            +0.1        0.05 ±  8%  perf-profile.self.cycles-pp.__tlb_remove_page_size
      0.00            +0.1        0.05 ±  8%  perf-profile.self.cycles-pp.flush_tlb_func
      0.05 ±  7%      +0.1        0.11 ±  8%  perf-profile.self.cycles-pp.perf_exclude_event
      0.00            +0.1        0.06 ± 11%  perf-profile.self.cycles-pp.restore_regs_and_return_to_kernel
      0.00            +0.1        0.06 ± 15%  perf-profile.self.cycles-pp.irqentry_enter
      0.00            +0.1        0.06 ±  9%  perf-profile.self.cycles-pp.llist_add_batch
      0.00            +0.1        0.06 ±  9%  perf-profile.self.cycles-pp.pte_mkwrite
      0.05 ± 45%      +0.1        0.11 ±  6%  perf-profile.self.cycles-pp.folio_mapping
      0.14 ± 10%      +0.1        0.20 ±  8%  perf-profile.self.cycles-pp.__pte_offset_map
      0.06 ± 11%      +0.1        0.13 ±  6%  perf-profile.self.cycles-pp.error_return
      0.00            +0.1        0.07 ± 10%  perf-profile.self.cycles-pp.vm_normal_page
      0.06 ± 11%      +0.1        0.14 ±  8%  perf-profile.self.cycles-pp.__mod_lruvec_state
      0.08 ±  8%      +0.1        0.16 ±  6%  perf-profile.self.cycles-pp.free_swap_cache
      0.00            +0.1        0.08 ±  8%  perf-profile.self.cycles-pp.smp_call_function_many_cond
      0.08 ±  6%      +0.1        0.16 ± 11%  perf-profile.self.cycles-pp.xas_start
      0.10 ± 26%      +0.1        0.18 ± 20%  perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
      0.09 ±  9%      +0.1        0.18 ±  5%  perf-profile.self.cycles-pp.tlb_flush_rmaps
      0.08 ±  5%      +0.1        0.18 ±  3%  perf-profile.self.cycles-pp.timestamp_truncate
      0.10 ± 14%      +0.1        0.20 ±  4%  perf-profile.self.cycles-pp.inode_needs_update_time
      0.06 ± 19%      +0.1        0.17 ±  6%  perf-profile.self.cycles-pp.default_send_IPI_mask_sequence_phys
      0.03 ±100%      +0.1        0.14 ±  7%  perf-profile.self.cycles-pp.native_flush_tlb_local
      0.08 ± 12%      +0.1        0.19 ± 17%  perf-profile.self.cycles-pp.cgroup_rstat_updated
      0.10 ±  4%      +0.1        0.23 ±  7%  perf-profile.self.cycles-pp._compound_head
      0.13 ±  7%      +0.1        0.26 ±  4%  perf-profile.self.cycles-pp.exc_page_fault
      0.13 ±  5%      +0.1        0.27 ±  3%  perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
      0.14 ± 10%      +0.1        0.28 ±  5%  perf-profile.self.cycles-pp.release_pages
      0.14 ±  4%      +0.1        0.29 ±  5%  perf-profile.self.cycles-pp.__pte_offset_map_lock
      0.12 ±  7%      +0.1        0.27 ±  4%  perf-profile.self.cycles-pp.finish_fault
      0.14 ±  9%      +0.2        0.29 ±  2%  perf-profile.self.cycles-pp.folio_unlock
      0.21 ±  2%      +0.2        0.37 ±  3%  perf-profile.self.cycles-pp._raw_spin_lock
      0.16 ± 14%      +0.2        0.32 ±  5%  perf-profile.self.cycles-pp.folio_mark_dirty
      0.17 ±  9%      +0.2        0.34 ±  3%  perf-profile.self.cycles-pp.xas_load
      0.18 ±  8%      +0.2        0.36 ±  5%  perf-profile.self.cycles-pp.asm_exc_page_fault
      0.15 ± 15%      +0.2        0.33 ±  7%  perf-profile.self.cycles-pp.__mod_lruvec_page_state
      0.27 ± 11%      +0.2        0.45 ±  8%  perf-profile.self.cycles-pp.do_fault
      0.00            +0.2        0.18 ±  5%  perf-profile.self.cycles-pp.exit_to_user_mode_prepare
      0.27 ±  3%      +0.2        0.47 ±  6%  perf-profile.self.cycles-pp.shmem_fault
      0.16 ± 18%      +0.2        0.38 ±  8%  perf-profile.self.cycles-pp.fault_dirty_shared_page
      0.18 ±  6%      +0.2        0.40 ±  5%  perf-profile.self.cycles-pp.folio_add_file_rmap_range
      0.19 ± 20%      +0.2        0.42 ± 14%  perf-profile.self.cycles-pp.__mod_node_page_state
      0.27 ±  7%      +0.2        0.51 ±  3%  perf-profile.self.cycles-pp.xas_descend
      0.24 ± 12%      +0.3        0.51 ±  9%  perf-profile.self.cycles-pp.__perf_sw_event
      0.34 ±  4%      +0.3        0.61 ±  3%  perf-profile.self.cycles-pp.do_user_addr_fault
      0.25 ±  9%      +0.3        0.58 ±  4%  perf-profile.self.cycles-pp.noop_dirty_folio
      0.28 ±  5%      +0.3        0.60 ±  5%  perf-profile.self.cycles-pp.set_pte_range
      0.28 ±  5%      +0.3        0.60 ±  4%  perf-profile.self.cycles-pp.page_remove_rmap
      0.34 ±  4%      +0.3        0.66 ±  3%  perf-profile.self.cycles-pp.shmem_get_folio_gfp
      0.32 ±  3%      +0.3        0.65 ±  3%  perf-profile.self.cycles-pp.filemap_get_entry
      0.66 ±  5%      +0.7        1.39 ±  3%  perf-profile.self.cycles-pp.zap_pte_range
      0.82 ±  3%      +0.9        1.76 ±  3%  perf-profile.self.cycles-pp.___perf_sw_event
      1.46 ±  2%      +1.5        2.98 ±  2%  perf-profile.self.cycles-pp.sync_regs
      1.97 ±  2%      +2.0        3.99 ±  2%  perf-profile.self.cycles-pp.native_irq_return_iret
      3.20 ±  2%      +3.1        6.33 ±  2%  perf-profile.self.cycles-pp.irqentry_exit_to_user_mode
      3.65 ±  3%      +3.8        7.47 ±  2%  perf-profile.self.cycles-pp.error_entry
      3.71 ±  3%      +4.0        7.66 ±  2%  perf-profile.self.cycles-pp.__irqentry_text_end
      5.82 ±  2%      +6.4       12.22 ±  2%  perf-profile.self.cycles-pp.testcase


***************************************************************************************************
lkp-skl-fpga01: 104 threads 2 sockets (Skylake) with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/process/16/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/page_fault3/will-it-scale

commit: 
  164b06f238 ("mm: call wp_page_copy() under the VMA lock")
  4ed4379881 ("mm: handle shared faults under the VMA lock")

164b06f238b98631 4ed4379881aa62588aba6442a9f 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      3.93 ±  9%      +0.6        4.48 ±  3%  mpstat.cpu.all.usr%
     69443 ± 37%     -43.5%      39209 ± 52%  numa-numastat.node0.other_node
     69443 ± 37%     -43.5%      39209 ± 52%  numa-vmstat.node0.numa_other
    815.58 ±  4%     -13.9%     702.42 ±  6%  sched_debug.cpu.nr_switches.min
      0.17           -11.8%       0.15        turbostat.IPC
   7829312            +7.8%    8442054        will-it-scale.16.processes
    489331            +7.8%     527628        will-it-scale.per_process_ops
   7829312            +7.8%    8442054        will-it-scale.workload
   6054588            +5.6%    6393936        proc-vmstat.numa_hit
   5946949            +5.7%    6286318        proc-vmstat.numa_local
   6138630            +5.6%    6479594        proc-vmstat.pgalloc_normal
 2.356e+09            +7.8%  2.541e+09        proc-vmstat.pgfault
   6094218            +5.6%    6435293        proc-vmstat.pgfree
  33370577 ±  5%      +8.1%   36086535 ±  2%  perf-stat.i.branch-misses
  13591855 ±  6%      +9.3%   14849901 ±  3%  perf-stat.i.cache-misses
  87506837            +2.6%   89773400        perf-stat.i.cache-references
   9248232 ±  6%     +10.3%   10201074 ±  3%  perf-stat.i.dTLB-load-misses
      4.89 ±  8%      +1.4        6.25 ±  3%  perf-stat.i.dTLB-store-miss-rate%
 2.039e+08 ±  9%     +14.0%  2.324e+08 ±  3%  perf-stat.i.dTLB-store-misses
      2165 ±  5%     -18.8%       1758 ±  7%  perf-stat.i.instructions-per-iTLB-miss
   7100967 ±  9%     +13.9%    8087600 ±  3%  perf-stat.i.minor-faults
   7137308 ±  9%     +13.9%    8128127 ±  3%  perf-stat.i.node-stores
   7100971 ±  9%     +13.9%    8087600 ±  3%  perf-stat.i.page-faults
      0.51 ±  3%     +21.1%       0.62        perf-stat.overall.MPKI
      0.60 ±  4%      +0.1        0.72        perf-stat.overall.branch-miss-rate%
      1.64           +16.4%       1.91        perf-stat.overall.cpi
      0.14 ±  3%      +0.0        0.17        perf-stat.overall.dTLB-load-miss-rate%
      5.33            +1.2        6.49        perf-stat.overall.dTLB-store-miss-rate%
      2276 ±  5%     -21.9%       1779 ±  6%  perf-stat.overall.instructions-per-iTLB-miss
      0.61           -14.1%       0.52        perf-stat.overall.ipc
   1126412           -20.9%     890493        perf-stat.overall.path-length
  33278957 ±  4%      +8.1%   35982000 ±  2%  perf-stat.ps.branch-misses
  13557468 ±  6%      +9.2%   14806522 ±  3%  perf-stat.ps.cache-misses
  87230707            +2.6%   89485467        perf-stat.ps.cache-references
   9225940 ±  6%     +10.3%   10171834 ±  3%  perf-stat.ps.dTLB-load-misses
 2.035e+08 ±  9%     +13.9%  2.318e+08 ±  3%  perf-stat.ps.dTLB-store-misses
   7085678 ±  9%     +13.8%    8065196 ±  3%  perf-stat.ps.minor-faults
   7121860 ±  9%     +13.8%    8105529 ±  3%  perf-stat.ps.node-stores
   7085683 ±  9%     +13.8%    8065196 ±  3%  perf-stat.ps.page-faults
 8.819e+12           -14.8%  7.518e+12        perf-stat.total.instructions
     23.48 ±  2%      -3.4       20.08        perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.testcase
     22.94 ±  2%      -3.3       19.65        perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
     16.46 ±  2%      -1.6       14.84        perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
     12.98 ±  2%      -1.1       11.92        perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
      0.58 ±  2%      +0.0        0.62        perf-profile.calltrace.cycles-pp.__mod_lruvec_page_state.page_remove_rmap.tlb_flush_rmaps.zap_pte_range.zap_pmd_range
      0.59 ±  6%      +0.1        0.64 ±  3%  perf-profile.calltrace.cycles-pp.xas_descend.xas_load.filemap_get_entry.shmem_get_folio_gfp.shmem_fault
      0.53 ±  4%      +0.1        0.59 ±  3%  perf-profile.calltrace.cycles-pp.inode_needs_update_time.file_update_time.fault_dirty_shared_page.do_fault.__handle_mm_fault
      1.08 ±  2%      +0.1        1.14 ±  3%  perf-profile.calltrace.cycles-pp.___perf_sw_event.__perf_sw_event.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
      0.64 ±  4%      +0.1        0.71 ±  2%  perf-profile.calltrace.cycles-pp.file_update_time.fault_dirty_shared_page.do_fault.__handle_mm_fault.handle_mm_fault
      1.05 ±  4%      +0.1        1.12        perf-profile.calltrace.cycles-pp.xas_load.filemap_get_entry.shmem_get_folio_gfp.shmem_fault.__do_fault
      0.76 ±  4%      +0.1        0.86 ±  3%  perf-profile.calltrace.cycles-pp.__mod_lruvec_page_state.folio_add_file_rmap_range.set_pte_range.finish_fault.do_fault
      0.92 ±  2%      +0.1        1.01        perf-profile.calltrace.cycles-pp.___perf_sw_event.__perf_sw_event.handle_mm_fault.do_user_addr_fault.exc_page_fault
      1.34 ±  3%      +0.1        1.43 ±  3%  perf-profile.calltrace.cycles-pp.__perf_sw_event.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
      0.71 ±  2%      +0.1        0.81 ±  3%  perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
      0.99 ±  3%      +0.1        1.09        perf-profile.calltrace.cycles-pp.mas_walk.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
      1.22 ±  2%      +0.1        1.34        perf-profile.calltrace.cycles-pp.page_remove_rmap.tlb_flush_rmaps.zap_pte_range.zap_pmd_range.unmap_page_range
      1.24 ±  2%      +0.1        1.36        perf-profile.calltrace.cycles-pp.__perf_sw_event.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
      1.40 ±  2%      +0.1        1.53        perf-profile.calltrace.cycles-pp.tlb_flush_rmaps.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
      1.17 ±  4%      +0.1        1.30 ±  3%  perf-profile.calltrace.cycles-pp.folio_add_file_rmap_range.set_pte_range.finish_fault.do_fault.__handle_mm_fault
      1.76 ±  3%      +0.1        1.89        perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_fault.__do_fault.do_fault
      1.93 ±  3%      +0.2        2.10        perf-profile.calltrace.cycles-pp.fault_dirty_shared_page.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
      2.52 ±  2%      +0.2        2.72        perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_fault.__do_fault.do_fault.__handle_mm_fault
      0.34 ± 70%      +0.2        0.55 ±  3%  perf-profile.calltrace.cycles-pp.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
      2.30 ±  2%      +0.2        2.51 ±  2%  perf-profile.calltrace.cycles-pp.set_pte_range.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
      1.86 ±  3%      +0.2        2.08        perf-profile.calltrace.cycles-pp.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
      3.02 ±  2%      +0.2        3.26        perf-profile.calltrace.cycles-pp.sync_regs.asm_exc_page_fault.testcase
      3.31 ±  2%      +0.2        3.56        perf-profile.calltrace.cycles-pp.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
      3.14 ±  2%      +0.2        3.40        perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_fault.__handle_mm_fault.handle_mm_fault
      3.32            +0.3        3.62 ±  2%  perf-profile.calltrace.cycles-pp.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
      4.22 ±  2%      +0.4        4.60        perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
      9.50 ±  2%      +0.4        9.88        perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
      4.32 ±  2%      +0.4        4.71        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
      4.30 ±  2%      +0.4        4.70        perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
      4.30 ±  2%      +0.4        4.70        perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
      4.30 ±  2%      +0.4        4.70        perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap
      4.32 ±  2%      +0.4        4.71        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
      4.32 ±  2%      +0.4        4.71        perf-profile.calltrace.cycles-pp.__munmap
      4.32 ±  2%      +0.4        4.71        perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
      4.32 ±  2%      +0.4        4.71        perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
      4.32 ±  2%      +0.4        4.71        perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
      4.32 ±  2%      +0.4        4.71        perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
      4.32 ±  2%      +0.4        4.71        perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
      6.61 ±  2%      +0.6        7.21 ±  2%  perf-profile.calltrace.cycles-pp.irqentry_exit_to_user_mode.asm_exc_page_fault.testcase
      7.67 ±  2%      +0.7        8.39        perf-profile.calltrace.cycles-pp.error_entry.testcase
      7.80 ±  2%      +0.8        8.57        perf-profile.calltrace.cycles-pp.__irqentry_text_end.testcase
     23.14 ±  2%      -3.4       19.72        perf-profile.children.cycles-pp.do_user_addr_fault
     23.52 ±  2%      -3.4       20.12        perf-profile.children.cycles-pp.exc_page_fault
     16.58 ±  2%      -1.6       14.99        perf-profile.children.cycles-pp.handle_mm_fault
     13.06 ±  2%      -1.1       11.97        perf-profile.children.cycles-pp.__handle_mm_fault
      1.38 ±  5%      -0.6        0.75        perf-profile.children.cycles-pp.mtree_range_walk
      1.05 ±  5%      -0.6        0.45 ± 10%  perf-profile.children.cycles-pp.handle_pte_fault
      0.58 ±  5%      -0.3        0.25 ±  6%  perf-profile.children.cycles-pp.pte_offset_map_nolock
      0.45 ±  3%      -0.3        0.14 ±  7%  perf-profile.children.cycles-pp.access_error
      0.65 ±  2%      -0.3        0.36 ±  4%  perf-profile.children.cycles-pp.down_read_trylock
      0.62 ±  5%      -0.3        0.34 ±  4%  perf-profile.children.cycles-pp.up_read
      0.33 ±  4%      -0.1        0.24 ±  4%  perf-profile.children.cycles-pp.__pte_offset_map
      0.20 ±  4%      +0.0        0.22 ±  6%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
      0.42 ±  3%      +0.0        0.45 ±  4%  perf-profile.children.cycles-pp.percpu_counter_add_batch
      0.16 ±  4%      +0.0        0.18 ±  6%  perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
      0.30 ±  2%      +0.0        0.33 ±  2%  perf-profile.children.cycles-pp.release_pages
      0.38 ±  5%      +0.0        0.42 ±  5%  perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
      0.50 ±  3%      +0.0        0.54 ±  3%  perf-profile.children.cycles-pp.folio_mark_dirty
      0.51 ±  3%      +0.0        0.56 ±  3%  perf-profile.children.cycles-pp.tlb_batch_pages_flush
      0.35 ±  4%      +0.1        0.40 ±  4%  perf-profile.children.cycles-pp._raw_spin_lock
      0.60 ±  6%      +0.1        0.66 ±  2%  perf-profile.children.cycles-pp.xas_descend
      0.55 ±  3%      +0.1        0.60 ±  3%  perf-profile.children.cycles-pp.inode_needs_update_time
      0.54 ±  4%      +0.1        0.60 ±  3%  perf-profile.children.cycles-pp.__mod_node_page_state
      0.65 ±  4%      +0.1        0.72 ±  2%  perf-profile.children.cycles-pp.file_update_time
      1.10 ±  4%      +0.1        1.17        perf-profile.children.cycles-pp.xas_load
      0.69 ±  2%      +0.1        0.77 ±  2%  perf-profile.children.cycles-pp.__mod_lruvec_state
      1.00 ±  3%      +0.1        1.09        perf-profile.children.cycles-pp.mas_walk
      0.73 ±  2%      +0.1        0.83 ±  3%  perf-profile.children.cycles-pp.__pte_offset_map_lock
      1.24 ±  2%      +0.1        1.36        perf-profile.children.cycles-pp.page_remove_rmap
      1.41 ±  2%      +0.1        1.54        perf-profile.children.cycles-pp.tlb_flush_rmaps
      1.18 ±  4%      +0.1        1.31 ±  3%  perf-profile.children.cycles-pp.folio_add_file_rmap_range
      1.77 ±  3%      +0.1        1.90        perf-profile.children.cycles-pp.filemap_get_entry
      1.43 ±  3%      +0.1        1.57 ±  2%  perf-profile.children.cycles-pp.__mod_lruvec_page_state
      2.03            +0.2        2.19        perf-profile.children.cycles-pp.___perf_sw_event
      1.97 ±  3%      +0.2        2.14        perf-profile.children.cycles-pp.fault_dirty_shared_page
      2.54 ±  2%      +0.2        2.74        perf-profile.children.cycles-pp.shmem_get_folio_gfp
      2.31 ±  2%      +0.2        2.53 ±  2%  perf-profile.children.cycles-pp.set_pte_range
      2.61 ±  2%      +0.2        2.83        perf-profile.children.cycles-pp.__perf_sw_event
      1.86 ±  3%      +0.2        2.09        perf-profile.children.cycles-pp.lock_vma_under_rcu
      3.33 ±  2%      +0.2        3.57        perf-profile.children.cycles-pp.__do_fault
      3.10 ±  2%      +0.3        3.36 ±  2%  perf-profile.children.cycles-pp.sync_regs
      3.16 ±  2%      +0.3        3.41        perf-profile.children.cycles-pp.shmem_fault
      3.39            +0.3        3.70 ±  2%  perf-profile.children.cycles-pp.finish_fault
      9.64 ±  2%      +0.4        9.99        perf-profile.children.cycles-pp.do_fault
      4.32 ±  2%      +0.4        4.71        perf-profile.children.cycles-pp.__munmap
      4.32 ±  2%      +0.4        4.71        perf-profile.children.cycles-pp.do_vmi_munmap
      4.32 ±  2%      +0.4        4.71        perf-profile.children.cycles-pp.do_vmi_align_munmap
      4.30 ±  2%      +0.4        4.70        perf-profile.children.cycles-pp.unmap_vmas
      4.30 ±  2%      +0.4        4.70        perf-profile.children.cycles-pp.unmap_page_range
      4.30 ±  2%      +0.4        4.70        perf-profile.children.cycles-pp.zap_pmd_range
      4.30 ±  2%      +0.4        4.70        perf-profile.children.cycles-pp.zap_pte_range
      4.40 ±  2%      +0.4        4.80        perf-profile.children.cycles-pp.do_syscall_64
      4.32 ±  2%      +0.4        4.71        perf-profile.children.cycles-pp.__vm_munmap
      4.32 ±  2%      +0.4        4.71        perf-profile.children.cycles-pp.__x64_sys_munmap
      4.32 ±  2%      +0.4        4.71        perf-profile.children.cycles-pp.unmap_region
      4.41 ±  2%      +0.4        4.80        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      4.07            +0.4        4.47        perf-profile.children.cycles-pp.native_irq_return_iret
      6.72 ±  2%      +0.6        7.32 ±  2%  perf-profile.children.cycles-pp.irqentry_exit_to_user_mode
      7.75 ±  2%      +0.7        8.48        perf-profile.children.cycles-pp.error_entry
      7.80 ±  2%      +0.8        8.57        perf-profile.children.cycles-pp.__irqentry_text_end
      2.35 ±  2%      -0.8        1.51 ±  3%  perf-profile.self.cycles-pp.__handle_mm_fault
      1.37 ±  5%      -0.6        0.75        perf-profile.self.cycles-pp.mtree_range_walk
      1.93 ±  5%      -0.6        1.36 ±  5%  perf-profile.self.cycles-pp.handle_mm_fault
      0.63 ±  2%      -0.3        0.36 ±  3%  perf-profile.self.cycles-pp.down_read_trylock
      0.47 ±  6%      -0.3        0.20 ± 17%  perf-profile.self.cycles-pp.handle_pte_fault
      0.90 ±  3%      -0.3        0.64 ±  3%  perf-profile.self.cycles-pp.do_user_addr_fault
      0.35 ±  7%      -0.2        0.12 ± 11%  perf-profile.self.cycles-pp.pte_offset_map_nolock
      0.57 ±  5%      -0.2        0.34 ±  4%  perf-profile.self.cycles-pp.up_read
      0.37 ±  4%      -0.2        0.14 ±  7%  perf-profile.self.cycles-pp.access_error
      0.70 ±  3%      -0.1        0.58 ±  3%  perf-profile.self.cycles-pp.do_fault
      0.31 ±  4%      -0.1        0.23 ±  4%  perf-profile.self.cycles-pp.__pte_offset_map
      0.29            +0.0        0.32        perf-profile.self.cycles-pp.release_pages
      0.16 ±  4%      +0.0        0.18 ±  6%  perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
      0.26 ±  4%      +0.0        0.30 ±  4%  perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
      0.40 ±  4%      +0.0        0.44 ±  4%  perf-profile.self.cycles-pp.folio_add_file_rmap_range
      0.29 ±  3%      +0.0        0.33 ±  5%  perf-profile.self.cycles-pp.__pte_offset_map_lock
      0.42 ±  4%      +0.0        0.47 ±  4%  perf-profile.self.cycles-pp.fault_dirty_shared_page
      0.35 ±  3%      +0.1        0.40 ±  3%  perf-profile.self.cycles-pp._raw_spin_lock
      0.60 ±  3%      +0.1        0.66 ±  2%  perf-profile.self.cycles-pp.set_pte_range
      0.69 ±  2%      +0.1        0.75 ±  2%  perf-profile.self.cycles-pp.shmem_get_folio_gfp
      0.52 ±  3%      +0.1        0.58 ±  3%  perf-profile.self.cycles-pp.__mod_node_page_state
      0.59 ±  5%      +0.1        0.66 ±  2%  perf-profile.self.cycles-pp.page_remove_rmap
      0.54 ±  5%      +0.1        0.63 ±  2%  perf-profile.self.cycles-pp.lock_vma_under_rcu
      1.43            +0.1        1.54 ±  2%  perf-profile.self.cycles-pp.zap_pte_range
      1.80            +0.1        1.94        perf-profile.self.cycles-pp.___perf_sw_event
      3.10 ±  2%      +0.3        3.36 ±  2%  perf-profile.self.cycles-pp.sync_regs
      4.06            +0.4        4.46        perf-profile.self.cycles-pp.native_irq_return_iret
      6.52 ±  2%      +0.6        7.10 ±  2%  perf-profile.self.cycles-pp.irqentry_exit_to_user_mode
      7.72 ±  2%      +0.7        8.44        perf-profile.self.cycles-pp.error_entry
      7.80 ±  2%      +0.8        8.57        perf-profile.self.cycles-pp.__irqentry_text_end
     12.39            +1.1       13.50        perf-profile.self.cycles-pp.testcase



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux