Re: [PATCH v2 5/6] mm: Handle read faults under the VMA lock

Hello,

kernel test robot noticed a 46.0% improvement in vm-scalability.throughput on:


commit: 39fbbca087dd149cdb82f08e7b92d62395c21ecf ("[PATCH v2 5/6] mm: Handle read faults under the VMA lock")
url: https://github.com/intel-lab-lkp/linux/commits/Matthew-Wilcox-Oracle/mm-Make-lock_folio_maybe_drop_mmap-VMA-lock-aware/20231007-035513
base: v6.6-rc4
patch link: https://lore.kernel.org/all/20231006195318.4087158-6-willy@xxxxxxxxxxxxx/
patch subject: [PATCH v2 5/6] mm: Handle read faults under the VMA lock

testcase: vm-scalability
test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory
parameters:

	runtime: 300s
	size: 2T
	test: shm-pread-seq-mt
	cpufreq_governor: performance

test-description: The motivation behind this suite is to exercise functions and regions of the mm/ subtree of the Linux kernel which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/





Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231020/202310201715.3f52109d-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/300s/2T/lkp-csl-2sp3/shm-pread-seq-mt/vm-scalability

commit: 
  90e99527c7 ("mm: Handle COW faults under the VMA lock")
  39fbbca087 ("mm: Handle read faults under the VMA lock")

90e99527c746cd9e 39fbbca087dd149cdb82f08e7b9 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     34.69 ± 23%     +72.5%      59.82 ±  2%  vm-scalability.free_time
    173385           +45.6%     252524        vm-scalability.median
  16599151           +46.0%   24242352        vm-scalability.throughput
    390.45            +6.9%     417.32        vm-scalability.time.elapsed_time
    390.45            +6.9%     417.32        vm-scalability.time.elapsed_time.max
     45781 ±  2%     +16.3%      53251 ±  2%  vm-scalability.time.involuntary_context_switches
 4.213e+09           +50.1%  6.325e+09        vm-scalability.time.maximum_resident_set_size
 5.316e+08           +47.3%   7.83e+08        vm-scalability.time.minor_page_faults
      6400            -8.0%       5890        vm-scalability.time.percent_of_cpu_this_job_got
     21673           -10.2%      19455        vm-scalability.time.system_time
      3319           +54.4%       5126        vm-scalability.time.user_time
 2.321e+08 ±  2%     +27.2%  2.953e+08 ±  5%  vm-scalability.time.voluntary_context_switches
 5.004e+09           +42.2%  7.116e+09        vm-scalability.workload
     13110           +24.0%      16254        uptime.idle
  1.16e+10           +24.5%  1.444e+10        cpuidle..time
 2.648e+08 ±  3%     +16.3%  3.079e+08 ±  5%  cpuidle..usage
     22.86            +6.3       29.17        mpstat.cpu.all.idle%
      8.29 ±  5%      -1.2        7.13 ±  7%  mpstat.cpu.all.iowait%
     58.63            -9.2       49.38        mpstat.cpu.all.sys%
      9.05            +4.0       13.09        mpstat.cpu.all.usr%
   8721571 ±  5%     +44.8%   12630342 ±  2%  numa-numastat.node0.local_node
   8773210 ±  5%     +44.8%   12706884 ±  2%  numa-numastat.node0.numa_hit
   7793725 ±  5%     +51.3%   11793573        numa-numastat.node1.local_node
   7842342 ±  5%     +50.7%   11816543        numa-numastat.node1.numa_hit
     23.17           +26.8%      29.37        vmstat.cpu.id
  31295414           +50.9%   47211341        vmstat.memory.cache
  95303378           -18.8%   77355720        vmstat.memory.free
   1176885 ±  2%     +19.2%    1402891 ±  3%  vmstat.system.cs
    194658            +5.4%     205149 ±  2%  vmstat.system.in
   9920198 ± 10%     -48.9%    5071533 ± 15%  turbostat.C1
      0.51 ± 12%      -0.3        0.21 ± 12%  turbostat.C1%
   1831098 ± 15%     -72.0%     512888 ± 19%  turbostat.C1E
      0.14 ± 13%      -0.1        0.06 ± 11%  turbostat.C1E%
   8736699           +36.3%   11905646        turbostat.C6
     22.74            +6.3       29.02        turbostat.C6%
     17.82           +25.5%      22.37        turbostat.CPU%c1
      5.36           +28.2%       6.87        turbostat.CPU%c6
      0.07           +42.9%       0.10        turbostat.IPC
  77317703           +12.3%   86804635 ±  3%  turbostat.IRQ
 2.443e+08 ±  3%     +18.9%  2.904e+08 ±  6%  turbostat.POLL
      4.80           +30.2%       6.24        turbostat.Pkg%pc2
    266.73            -1.3%     263.33        turbostat.PkgWatt
      0.00           -25.0%       0.00        perf-sched.sch_delay.avg.ms.io_schedule.folio_wait_bit_common.shmem_get_folio_gfp.shmem_fault
      0.06 ± 11%     -21.8%       0.04 ±  9%  perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
     26.45 ±  9%     -16.0%      22.21 ±  6%  perf-sched.sch_delay.max.ms.io_schedule.folio_wait_bit_common.shmem_get_folio_gfp.shmem_fault
      0.00           -25.0%       0.00        perf-sched.total_sch_delay.average.ms
    106.37 ±167%     -79.1%      22.21 ±  6%  perf-sched.total_sch_delay.max.ms
      0.46 ±  2%     -16.0%       0.39 ±  5%  perf-sched.total_wait_and_delay.average.ms
   2202457 ±  2%     +26.1%    2776824 ±  3%  perf-sched.total_wait_and_delay.count.ms
      0.45 ±  2%     -15.9%       0.38 ±  5%  perf-sched.total_wait_time.average.ms
      0.02 ±  2%     -19.8%       0.01 ±  2%  perf-sched.wait_and_delay.avg.ms.io_schedule.folio_wait_bit_common.shmem_get_folio_gfp.shmem_fault
    494.65 ±  4%     +10.6%     546.88 ±  3%  perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
   2196122 ±  2%     +26.1%    2770017 ±  3%  perf-sched.wait_and_delay.count.io_schedule.folio_wait_bit_common.shmem_get_folio_gfp.shmem_fault
      0.01 ±  3%     -19.5%       0.01        perf-sched.wait_time.avg.ms.io_schedule.folio_wait_bit_common.shmem_get_folio_gfp.shmem_fault
    494.63 ±  4%     +10.6%     546.87 ±  3%  perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.22 ± 42%     -68.8%       0.07 ±125%  perf-sched.wait_time.max.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
  11445425           +82.1%   20837223        meminfo.Active
  11444642           +82.1%   20836443        meminfo.Active(anon)
  31218122           +51.0%   47138293        meminfo.Cached
  30006048           +53.7%   46116816        meminfo.Committed_AS
  17425032           +37.4%   23950392        meminfo.Inactive
  17423257           +37.5%   23948613        meminfo.Inactive(anon)
    164910           +21.8%     200913        meminfo.KReclaimable
  26336530           +57.6%   41514589        meminfo.Mapped
  94668993           -19.0%   76693589        meminfo.MemAvailable
  95202238           -18.9%   77208832        meminfo.MemFree
  36610737           +49.1%   54604143        meminfo.Memused
   4072810           +50.1%    6114589        meminfo.PageTables
    164910           +21.8%     200913        meminfo.SReclaimable
  28535318           +55.8%   44455489        meminfo.Shmem
    367289           +10.1%     404373        meminfo.Slab
  37978157           +50.2%   57055526        meminfo.max_used_kB
   2860756           +82.1%    5208445        proc-vmstat.nr_active_anon
   2361286           -19.0%    1912151        proc-vmstat.nr_dirty_background_threshold
   4728345           -19.0%    3828978        proc-vmstat.nr_dirty_threshold
   7804148           +51.0%   11783823        proc-vmstat.nr_file_pages
  23801109           -18.9%   19303173        proc-vmstat.nr_free_pages
   4355690           +37.5%    5986921        proc-vmstat.nr_inactive_anon
   6583645           +57.6%   10377790        proc-vmstat.nr_mapped
   1018109           +50.1%    1528565        proc-vmstat.nr_page_table_pages
   7133183           +55.8%   11112858        proc-vmstat.nr_shmem
     41226           +21.8%      50226        proc-vmstat.nr_slab_reclaimable
   2860756           +82.1%    5208445        proc-vmstat.nr_zone_active_anon
   4355690           +37.5%    5986921        proc-vmstat.nr_zone_inactive_anon
    112051            +3.8%     116273        proc-vmstat.numa_hint_faults
  16618553           +47.6%   24525492        proc-vmstat.numa_hit
  16518296           +47.9%   24425975        proc-vmstat.numa_local
  11052273           +49.9%   16566743        proc-vmstat.pgactivate
  16757533           +47.2%   24672644        proc-vmstat.pgalloc_normal
 5.329e+08           +47.2%  7.844e+08        proc-vmstat.pgfault
  16101786           +48.3%   23877738        proc-vmstat.pgfree
   3302784            +6.0%    3500288        proc-vmstat.unevictable_pgs_scanned
   6101287 ±  7%     +81.3%   11062634 ±  3%  numa-meminfo.node0.Active
   6101026 ±  7%     +81.3%   11062389 ±  3%  numa-meminfo.node0.Active(anon)
  17217355 ±  5%     +46.3%   25196100 ±  3%  numa-meminfo.node0.FilePages
   9363213 ±  7%     +31.9%   12347562 ±  2%  numa-meminfo.node0.Inactive
   9362621 ±  7%     +31.9%   12347130 ±  2%  numa-meminfo.node0.Inactive(anon)
  14211196 ±  7%     +51.2%   21487599        numa-meminfo.node0.Mapped
  45879058 ±  2%     -19.6%   36888633 ±  2%  numa-meminfo.node0.MemFree
  19925073 ±  5%     +45.1%   28915498 ±  3%  numa-meminfo.node0.MemUsed
   2032891           +50.5%    3060344        numa-meminfo.node0.PageTables
  15318197 ±  6%     +52.0%   23276446 ±  2%  numa-meminfo.node0.Shmem
   5342463 ±  7%     +82.9%    9769639 ±  4%  numa-meminfo.node1.Active
   5341941 ±  7%     +82.9%    9769104 ±  4%  numa-meminfo.node1.Active(anon)
  13998966 ±  8%     +56.6%   21919509 ±  3%  numa-meminfo.node1.FilePages
   8060699 ±  7%     +43.7%   11584190 ±  2%  numa-meminfo.node1.Inactive
   8059515 ±  7%     +43.7%   11582844 ±  2%  numa-meminfo.node1.Inactive(anon)
  12125745 ±  7%     +65.0%   20005342        numa-meminfo.node1.Mapped
  49326340 ±  2%     -18.2%   40347902 ±  2%  numa-meminfo.node1.MemFree
  16682503 ±  7%     +53.8%   25660941 ±  3%  numa-meminfo.node1.MemUsed
   2039529           +49.6%    3051247        numa-meminfo.node1.PageTables
  13214266 ±  7%     +60.1%   21155303 ±  2%  numa-meminfo.node1.Shmem
    156378 ± 13%     +21.1%     189316 ±  9%  numa-meminfo.node1.Slab
   1525784 ±  7%     +81.4%    2767183 ±  3%  numa-vmstat.node0.nr_active_anon
   4304756 ±  5%     +46.4%    6302189 ±  3%  numa-vmstat.node0.nr_file_pages
  11469263 ±  2%     -19.6%    9218468 ±  2%  numa-vmstat.node0.nr_free_pages
   2340569 ±  7%     +32.0%    3088383 ±  2%  numa-vmstat.node0.nr_inactive_anon
   3553304 ±  7%     +51.3%    5375214        numa-vmstat.node0.nr_mapped
    508315           +50.6%     765564        numa-vmstat.node0.nr_page_table_pages
   3829966 ±  6%     +52.0%    5822276 ±  2%  numa-vmstat.node0.nr_shmem
   1525783 ±  7%     +81.4%    2767184 ±  3%  numa-vmstat.node0.nr_zone_active_anon
   2340569 ±  7%     +32.0%    3088382 ±  2%  numa-vmstat.node0.nr_zone_inactive_anon
   8773341 ±  5%     +44.8%   12707017 ±  2%  numa-vmstat.node0.numa_hit
   8721702 ±  5%     +44.8%   12630474 ±  2%  numa-vmstat.node0.numa_local
   1335910 ±  7%     +82.9%    2443778 ±  4%  numa-vmstat.node1.nr_active_anon
   3500040 ±  8%     +56.7%    5482887 ±  3%  numa-vmstat.node1.nr_file_pages
  12331163 ±  2%     -18.2%   10083422 ±  2%  numa-vmstat.node1.nr_free_pages
   2014795 ±  7%     +43.8%    2897243 ±  2%  numa-vmstat.node1.nr_inactive_anon
   3031806 ±  7%     +65.1%    5004449        numa-vmstat.node1.nr_mapped
    510000           +49.7%     763297        numa-vmstat.node1.nr_page_table_pages
   3303865 ±  7%     +60.2%    5291835 ±  2%  numa-vmstat.node1.nr_shmem
   1335910 ±  7%     +82.9%    2443778 ±  4%  numa-vmstat.node1.nr_zone_active_anon
   2014795 ±  7%     +43.8%    2897242 ±  2%  numa-vmstat.node1.nr_zone_inactive_anon
   7842425 ±  5%     +50.7%   11816530        numa-vmstat.node1.numa_hit
   7793808 ±  5%     +51.3%   11793555        numa-vmstat.node1.numa_local
   9505083           +21.3%   11532590 ±  3%  sched_debug.cfs_rq:/.avg_vruntime.avg
   9551715           +21.4%   11595502 ±  3%  sched_debug.cfs_rq:/.avg_vruntime.max
   9426050           +21.4%   11443528 ±  3%  sched_debug.cfs_rq:/.avg_vruntime.min
     19249 ±  4%     +28.3%      24698 ± 10%  sched_debug.cfs_rq:/.avg_vruntime.stddev
      0.79           -30.7%       0.55 ±  8%  sched_debug.cfs_rq:/.h_nr_running.avg
     12458 ± 12%     +70.8%      21277 ± 22%  sched_debug.cfs_rq:/.load.avg
     13767 ± 95%    +311.7%      56677 ± 29%  sched_debug.cfs_rq:/.load.stddev
   9505083           +21.3%   11532590 ±  3%  sched_debug.cfs_rq:/.min_vruntime.avg
   9551715           +21.4%   11595502 ±  3%  sched_debug.cfs_rq:/.min_vruntime.max
   9426050           +21.4%   11443528 ±  3%  sched_debug.cfs_rq:/.min_vruntime.min
     19249 ±  4%     +28.3%      24698 ± 10%  sched_debug.cfs_rq:/.min_vruntime.stddev
      0.78           -30.7%       0.54 ±  8%  sched_debug.cfs_rq:/.nr_running.avg
    170.67           -21.4%     134.10 ±  6%  sched_debug.cfs_rq:/.removed.load_avg.max
    708.55           -32.2%     480.43 ±  7%  sched_debug.cfs_rq:/.runnable_avg.avg
      1510 ±  3%     -12.5%       1320 ±  4%  sched_debug.cfs_rq:/.runnable_avg.max
    219.68 ±  7%     -12.7%     191.74 ±  5%  sched_debug.cfs_rq:/.runnable_avg.stddev
    707.51           -32.3%     479.05 ±  7%  sched_debug.cfs_rq:/.util_avg.avg
      1506 ±  3%     -12.6%       1317 ±  4%  sched_debug.cfs_rq:/.util_avg.max
    219.64 ±  7%     -13.0%     191.15 ±  5%  sched_debug.cfs_rq:/.util_avg.stddev
    564.18 ±  2%     -32.4%     381.24 ±  8%  sched_debug.cfs_rq:/.util_est_enqueued.avg
      1168 ±  7%     -14.8%     995.94 ±  7%  sched_debug.cfs_rq:/.util_est_enqueued.max
    235.45 ±  5%     -21.4%     185.13 ±  7%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
    149234 ±  5%    +192.0%     435707 ± 10%  sched_debug.cpu.avg_idle.avg
    404765 ± 17%     +47.3%     596259 ± 15%  sched_debug.cpu.avg_idle.max
      5455 ±  4%   +3302.8%     185624 ± 34%  sched_debug.cpu.avg_idle.min
    201990           +24.9%     252309 ±  5%  sched_debug.cpu.clock.avg
    201997           +24.9%     252315 ±  5%  sched_debug.cpu.clock.max
    201983           +24.9%     252303 ±  5%  sched_debug.cpu.clock.min
      3.80 ±  2%     -10.1%       3.42 ±  3%  sched_debug.cpu.clock.stddev
    200296           +24.8%     249952 ±  5%  sched_debug.cpu.clock_task.avg
    200541           +24.8%     250280 ±  5%  sched_debug.cpu.clock_task.max
    194086           +25.5%     243582 ±  5%  sched_debug.cpu.clock_task.min
      4069           -32.7%       2739 ±  8%  sched_debug.cpu.curr->pid.avg
      8703           +15.2%      10027 ±  3%  sched_debug.cpu.curr->pid.max
      0.00 ±  6%     -27.2%       0.00 ±  5%  sched_debug.cpu.next_balance.stddev
      0.78           -32.7%       0.52 ±  8%  sched_debug.cpu.nr_running.avg
      0.33 ±  6%     -13.9%       0.29 ±  5%  sched_debug.cpu.nr_running.stddev
   2372181 ±  2%     +57.6%    3737590 ±  8%  sched_debug.cpu.nr_switches.avg
   2448893 ±  2%     +58.5%    3880813 ±  8%  sched_debug.cpu.nr_switches.max
   2290032 ±  2%     +55.9%    3570559 ±  8%  sched_debug.cpu.nr_switches.min
     36185 ± 10%     +74.8%      63244 ±  8%  sched_debug.cpu.nr_switches.stddev
      0.10 ± 19%    +138.0%       0.23 ± 19%  sched_debug.cpu.nr_uninterruptible.avg
    201984           +24.9%     252304 ±  5%  sched_debug.cpu_clk
    201415           +25.0%     251735 ±  5%  sched_debug.ktime
    202543           +24.8%     252867 ±  5%  sched_debug.sched_clk
      3.84 ±  2%     -14.1%       3.30 ±  2%  perf-stat.i.MPKI
 1.679e+10           +30.1%  2.186e+10        perf-stat.i.branch-instructions
      0.54 ±  2%      -0.1        0.45        perf-stat.i.branch-miss-rate%
  75872684            -2.6%   73927540        perf-stat.i.branch-misses
     31.85            -1.1       30.75        perf-stat.i.cache-miss-rate%
   1184992 ±  2%     +19.1%    1411069 ±  3%  perf-stat.i.context-switches
      3.49           -29.3%       2.47        perf-stat.i.cpi
 2.265e+11            -8.1%  2.081e+11        perf-stat.i.cpu-cycles
    950.46 ±  3%     -11.6%     840.03 ±  2%  perf-stat.i.cycles-between-cache-misses
   9514714 ± 12%     +27.3%   12109471 ± 10%  perf-stat.i.dTLB-load-misses
 1.556e+10           +29.9%  2.022e+10        perf-stat.i.dTLB-loads
   1575276 ±  5%     +35.8%    2138868 ±  5%  perf-stat.i.dTLB-store-misses
 3.396e+09           +21.6%  4.129e+09        perf-stat.i.dTLB-stores
     79.97            +2.8       82.74        perf-stat.i.iTLB-load-miss-rate%
   4265612            +8.4%    4624960 ±  2%  perf-stat.i.iTLB-load-misses
    712599 ±  8%     -38.4%     438645 ±  7%  perf-stat.i.iTLB-loads
  5.59e+10           +27.7%  7.137e+10        perf-stat.i.instructions
     12120           +11.6%      13525 ±  2%  perf-stat.i.instructions-per-iTLB-miss
      0.35           +32.7%       0.46        perf-stat.i.ipc
      0.04 ± 38%    +119.0%       0.08 ± 33%  perf-stat.i.major-faults
      2.36            -8.1%       2.17        perf-stat.i.metric.GHz
    863.69            +7.5%     928.37        perf-stat.i.metric.K/sec
    378.76           +28.8%     487.87        perf-stat.i.metric.M/sec
   1359089           +37.9%    1874285        perf-stat.i.minor-faults
     84.30            -2.8       81.50        perf-stat.i.node-load-miss-rate%
     89.54            -2.5       87.09        perf-stat.i.node-store-miss-rate%
   1359089           +37.9%    1874285        perf-stat.i.page-faults
      3.65 ±  3%     -22.5%       2.82 ±  4%  perf-stat.overall.MPKI
      0.45            -0.1        0.34        perf-stat.overall.branch-miss-rate%
     32.64            -1.7       30.98        perf-stat.overall.cache-miss-rate%
      4.05           -28.0%       2.92        perf-stat.overall.cpi
      1113 ±  3%      -7.1%       1034 ±  3%  perf-stat.overall.cycles-between-cache-misses
      0.05 ±  5%      +0.0        0.05 ±  5%  perf-stat.overall.dTLB-store-miss-rate%
     85.73            +5.6       91.37        perf-stat.overall.iTLB-load-miss-rate%
     13110 ±  2%     +17.8%      15440 ±  2%  perf-stat.overall.instructions-per-iTLB-miss
      0.25           +39.0%       0.34        perf-stat.overall.ipc
      4378            -4.2%       4195        perf-stat.overall.path-length
 1.679e+10           +30.2%  2.186e+10        perf-stat.ps.branch-instructions
  75862675            -2.6%   73920168        perf-stat.ps.branch-misses
   1184994 ±  2%     +19.1%    1411192 ±  3%  perf-stat.ps.context-switches
 2.265e+11            -8.1%  2.082e+11        perf-stat.ps.cpu-cycles
   9518014 ± 12%     +27.3%   12118863 ± 10%  perf-stat.ps.dTLB-load-misses
 1.556e+10           +29.9%  2.022e+10        perf-stat.ps.dTLB-loads
   1575414 ±  5%     +35.8%    2139373 ±  5%  perf-stat.ps.dTLB-store-misses
 3.396e+09           +21.6%  4.129e+09        perf-stat.ps.dTLB-stores
   4265139            +8.4%    4625090 ±  2%  perf-stat.ps.iTLB-load-misses
    711002 ±  8%     -38.5%     437258 ±  7%  perf-stat.ps.iTLB-loads
  5.59e+10           +27.7%  7.137e+10        perf-stat.ps.instructions
      0.04 ± 37%    +118.9%       0.08 ± 33%  perf-stat.ps.major-faults
   1359186           +37.9%    1874615        perf-stat.ps.minor-faults
   1359186           +37.9%    1874615        perf-stat.ps.page-faults
 2.191e+13           +36.3%  2.986e+13        perf-stat.total.instructions
     74.66            -6.7       67.93        perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.do_access
     74.61            -6.7       67.89        perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access
     53.18            -6.3       46.88        perf-profile.calltrace.cycles-pp.filemap_map_pages.do_read_fault.do_fault.__handle_mm_fault.handle_mm_fault
     35.54            -6.1       29.43        perf-profile.calltrace.cycles-pp.next_uptodate_folio.filemap_map_pages.do_read_fault.do_fault.__handle_mm_fault
     76.49            -5.4       71.07        perf-profile.calltrace.cycles-pp.asm_exc_page_fault.do_access
     79.82            -3.9       75.89        perf-profile.calltrace.cycles-pp.do_access
     70.02            -3.8       66.23        perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
     70.39            -3.7       66.70        perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access
     68.31            -2.8       65.51        perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
     68.29            -2.8       65.50        perf-profile.calltrace.cycles-pp.do_read_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
      0.65 ±  7%      -0.3        0.37 ± 71%  perf-profile.calltrace.cycles-pp.finish_task_switch.__schedule.schedule.io_schedule.folio_wait_bit_common
      1.94 ±  6%      -0.2        1.71 ±  6%  perf-profile.calltrace.cycles-pp.__schedule.schedule.io_schedule.folio_wait_bit_common.shmem_get_folio_gfp
      1.96 ±  6%      -0.2        1.74 ±  6%  perf-profile.calltrace.cycles-pp.io_schedule.folio_wait_bit_common.shmem_get_folio_gfp.shmem_fault.__do_fault
      1.95 ±  6%      -0.2        1.74 ±  6%  perf-profile.calltrace.cycles-pp.schedule.io_schedule.folio_wait_bit_common.shmem_get_folio_gfp.shmem_fault
      0.86            +0.1        1.00 ±  2%  perf-profile.calltrace.cycles-pp.folio_add_file_rmap_range.set_pte_range.filemap_map_pages.do_read_fault.do_fault
      0.56            +0.2        0.72 ±  4%  perf-profile.calltrace.cycles-pp.sched_ttwu_pending.__flush_smp_call_function_queue.flush_smp_call_function_queue.do_idle.cpu_startup_entry
      1.16 ±  3%      +0.2        1.33 ±  2%  perf-profile.calltrace.cycles-pp.set_pte_range.filemap_map_pages.do_read_fault.do_fault.__handle_mm_fault
      0.71 ±  2%      +0.2        0.92 ±  3%  perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.flush_smp_call_function_queue.do_idle.cpu_startup_entry.start_secondary
      0.78            +0.2        1.02 ±  4%  perf-profile.calltrace.cycles-pp.flush_smp_call_function_queue.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      0.44 ± 44%      +0.3        0.73 ±  3%  perf-profile.calltrace.cycles-pp.set_pte_range.finish_fault.do_read_fault.do_fault.__handle_mm_fault
      0.89 ±  9%      +0.3        1.24 ±  8%  perf-profile.calltrace.cycles-pp.finish_fault.do_read_fault.do_fault.__handle_mm_fault.handle_mm_fault
      1.23            +0.4        1.59        perf-profile.calltrace.cycles-pp.irqentry_exit_to_user_mode.asm_exc_page_fault.do_access
      0.18 ±141%      +0.4        0.57 ±  5%  perf-profile.calltrace.cycles-pp.try_to_wake_up.wake_page_function.__wake_up_common.folio_wake_bit.filemap_map_pages
      1.50            +0.6        2.05        perf-profile.calltrace.cycles-pp.filemap_get_entry.shmem_get_folio_gfp.shmem_fault.__do_fault.do_read_fault
      0.00            +0.6        0.56 ±  4%  perf-profile.calltrace.cycles-pp.wake_page_function.__wake_up_common.folio_wake_bit.do_read_fault.do_fault
      0.09 ±223%      +0.6        0.69 ±  4%  perf-profile.calltrace.cycles-pp.__wake_up_common.folio_wake_bit.do_read_fault.do_fault.__handle_mm_fault
      0.00            +0.6        0.60        perf-profile.calltrace.cycles-pp.folio_add_file_rmap_range.set_pte_range.finish_fault.do_read_fault.do_fault
      2.98 ±  3%      +0.7        3.66 ±  2%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.folio_wait_bit_common.shmem_get_folio_gfp.shmem_fault
      3.39 ±  3%      +0.8        4.21        perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.folio_wait_bit_common.shmem_get_folio_gfp.shmem_fault.__do_fault
      7.48            +0.9        8.41        perf-profile.calltrace.cycles-pp.folio_wait_bit_common.shmem_get_folio_gfp.shmem_fault.__do_fault.do_read_fault
      2.25 ±  6%      +1.0        3.30 ±  3%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_wake_bit.do_read_fault.do_fault
      2.44 ±  5%      +1.1        3.56 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_wake_bit.do_read_fault.do_fault.__handle_mm_fault
      3.11 ±  4%      +1.4        4.52        perf-profile.calltrace.cycles-pp.folio_wake_bit.do_read_fault.do_fault.__handle_mm_fault.handle_mm_fault
     10.14            +1.9       12.06        perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_fault.__do_fault.do_read_fault.do_fault
     10.26            +2.0       12.25        perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_read_fault.do_fault.__handle_mm_fault
     10.29            +2.0       12.29        perf-profile.calltrace.cycles-pp.__do_fault.do_read_fault.do_fault.__handle_mm_fault.handle_mm_fault
      9.69            +5.5       15.21 ±  2%  perf-profile.calltrace.cycles-pp.do_rw_once
     74.66            -6.7       67.94        perf-profile.children.cycles-pp.exc_page_fault
     74.62            -6.7       67.90        perf-profile.children.cycles-pp.do_user_addr_fault
     53.19            -6.3       46.89        perf-profile.children.cycles-pp.filemap_map_pages
     35.56            -6.1       29.44        perf-profile.children.cycles-pp.next_uptodate_folio
     76.51            -6.0       70.48        perf-profile.children.cycles-pp.asm_exc_page_fault
     70.02            -3.8       66.24        perf-profile.children.cycles-pp.__handle_mm_fault
     70.40            -3.7       66.71        perf-profile.children.cycles-pp.handle_mm_fault
     81.33            -3.5       77.78        perf-profile.children.cycles-pp.do_access
     68.32            -2.8       65.52        perf-profile.children.cycles-pp.do_fault
     68.30            -2.8       65.50        perf-profile.children.cycles-pp.do_read_fault
      2.07 ±  7%      -2.0        0.12 ±  6%  perf-profile.children.cycles-pp.down_read_trylock
      1.28 ±  4%      -1.1        0.16 ±  4%  perf-profile.children.cycles-pp.up_read
      0.65 ± 12%      -0.4        0.28 ± 15%  perf-profile.children.cycles-pp.intel_idle_irq
      1.96 ±  6%      -0.2        1.74 ±  6%  perf-profile.children.cycles-pp.schedule
      1.96 ±  6%      -0.2        1.74 ±  6%  perf-profile.children.cycles-pp.io_schedule
      0.36 ±  7%      -0.2        0.15 ±  3%  perf-profile.children.cycles-pp.mtree_range_walk
      0.30 ±  8%      -0.2        0.13 ± 14%  perf-profile.children.cycles-pp.mm_cid_get
      0.12 ± 12%      -0.1        0.03 ±100%  perf-profile.children.cycles-pp.update_sg_lb_stats
      0.16 ±  9%      -0.1        0.07 ± 15%  perf-profile.children.cycles-pp.load_balance
      0.14 ± 10%      -0.1        0.05 ± 46%  perf-profile.children.cycles-pp.update_sd_lb_stats
      0.20 ± 10%      -0.1        0.11 ±  8%  perf-profile.children.cycles-pp.newidle_balance
      0.14 ± 10%      -0.1        0.06 ± 17%  perf-profile.children.cycles-pp.find_busiest_group
      0.33 ±  6%      -0.0        0.28 ±  5%  perf-profile.children.cycles-pp.pick_next_task_fair
      0.05            +0.0        0.06        perf-profile.children.cycles-pp.nohz_run_idle_balance
      0.06            +0.0        0.08 ±  6%  perf-profile.children.cycles-pp.__update_load_avg_se
      0.04 ± 44%      +0.0        0.06        perf-profile.children.cycles-pp.reweight_entity
      0.09 ±  7%      +0.0        0.11 ±  4%  perf-profile.children.cycles-pp.xas_descend
      0.08 ±  5%      +0.0        0.10 ±  4%  perf-profile.children.cycles-pp.update_curr
      0.09 ±  7%      +0.0        0.11 ±  3%  perf-profile.children.cycles-pp.prepare_task_switch
      0.10 ±  4%      +0.0        0.12 ±  3%  perf-profile.children.cycles-pp.call_function_single_prep_ipi
      0.08 ±  4%      +0.0        0.10 ±  5%  perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
      0.04 ± 44%      +0.0        0.06 ±  7%  perf-profile.children.cycles-pp.sched_clock
      0.13 ±  7%      +0.0        0.16 ±  4%  perf-profile.children.cycles-pp.__sysvec_call_function_single
      0.08 ±  6%      +0.0        0.10 ±  3%  perf-profile.children.cycles-pp.set_next_entity
      0.16 ±  4%      +0.0        0.19 ±  3%  perf-profile.children.cycles-pp.__switch_to
      0.09 ±  4%      +0.0        0.12 ±  4%  perf-profile.children.cycles-pp.llist_reverse_order
      0.04 ± 44%      +0.0        0.07 ±  5%  perf-profile.children.cycles-pp.place_entity
      0.14 ±  3%      +0.0        0.16 ±  3%  perf-profile.children.cycles-pp.llist_add_batch
      0.09 ±  5%      +0.0        0.12 ±  6%  perf-profile.children.cycles-pp.available_idle_cpu
      0.15 ±  4%      +0.0        0.18 ±  4%  perf-profile.children.cycles-pp.sysvec_call_function_single
      0.08 ±  5%      +0.0        0.12 ±  6%  perf-profile.children.cycles-pp.wake_affine
      0.08            +0.0        0.11        perf-profile.children.cycles-pp.__list_del_entry_valid_or_report
      0.11 ±  4%      +0.0        0.14 ±  3%  perf-profile.children.cycles-pp.update_rq_clock_task
      0.11 ±  4%      +0.0        0.14 ±  4%  perf-profile.children.cycles-pp.__switch_to_asm
      0.04 ± 44%      +0.0        0.07 ±  6%  perf-profile.children.cycles-pp.folio_add_lru
      0.06 ±  7%      +0.0        0.10 ±  6%  perf-profile.children.cycles-pp.shmem_add_to_page_cache
      0.18 ±  5%      +0.0        0.22 ±  4%  perf-profile.children.cycles-pp.asm_sysvec_call_function_single
      0.02 ±141%      +0.0        0.06 ±  6%  perf-profile.children.cycles-pp.tick_nohz_idle_exit
      0.12 ±  3%      +0.0        0.17 ±  5%  perf-profile.children.cycles-pp.select_task_rq_fair
      0.13 ±  3%      +0.0        0.18 ±  6%  perf-profile.children.cycles-pp.select_task_rq
      0.23 ±  3%      +0.1        0.29 ±  3%  perf-profile.children.cycles-pp.__smp_call_single_queue
      0.20 ±  3%      +0.1        0.26 ±  3%  perf-profile.children.cycles-pp.update_load_avg
      0.01 ±223%      +0.1        0.07 ± 18%  perf-profile.children.cycles-pp.shmem_alloc_and_acct_folio
      0.26 ±  2%      +0.1        0.34 ±  3%  perf-profile.children.cycles-pp.dequeue_entity
      0.29 ±  3%      +0.1        0.37 ±  4%  perf-profile.children.cycles-pp.dequeue_task_fair
      0.17 ±  3%      +0.1        0.26 ±  2%  perf-profile.children.cycles-pp.sync_regs
      0.34 ±  2%      +0.1        0.42 ±  4%  perf-profile.children.cycles-pp.ttwu_queue_wakelist
      0.28 ±  3%      +0.1        0.37 ±  4%  perf-profile.children.cycles-pp.enqueue_entity
      0.28 ±  3%      +0.1        0.38 ±  6%  perf-profile.children.cycles-pp.__perf_sw_event
      0.32 ±  2%      +0.1        0.42 ±  5%  perf-profile.children.cycles-pp.___perf_sw_event
      0.34 ±  3%      +0.1        0.44 ±  4%  perf-profile.children.cycles-pp.enqueue_task_fair
      0.36 ±  2%      +0.1        0.46 ±  3%  perf-profile.children.cycles-pp.activate_task
      0.24 ±  2%      +0.1        0.35        perf-profile.children.cycles-pp.native_irq_return_iret
      0.30 ±  6%      +0.1        0.42 ± 10%  perf-profile.children.cycles-pp.xas_load
      0.31            +0.1        0.43 ±  3%  perf-profile.children.cycles-pp.folio_unlock
      0.44 ±  2%      +0.1        0.56 ±  4%  perf-profile.children.cycles-pp.ttwu_do_activate
      0.40 ±  6%      +0.2        0.56 ±  5%  perf-profile.children.cycles-pp._compound_head
      1.52            +0.2        1.68 ±  4%  perf-profile.children.cycles-pp.wake_page_function
      0.68 ±  3%      +0.2        0.86 ±  4%  perf-profile.children.cycles-pp.try_to_wake_up
      0.66 ±  2%      +0.2        0.84 ±  3%  perf-profile.children.cycles-pp.sched_ttwu_pending
      0.85 ±  2%      +0.2        1.09 ±  3%  perf-profile.children.cycles-pp.__flush_smp_call_function_queue
      0.79 ±  2%      +0.2        1.03 ±  4%  perf-profile.children.cycles-pp.flush_smp_call_function_queue
      1.83            +0.3        2.08 ±  4%  perf-profile.children.cycles-pp.__wake_up_common
      1.29            +0.3        1.60        perf-profile.children.cycles-pp.folio_add_file_rmap_range
      0.89 ±  9%      +0.4        1.24 ±  8%  perf-profile.children.cycles-pp.finish_fault
      1.24            +0.4        1.60        perf-profile.children.cycles-pp.irqentry_exit_to_user_mode
      1.68 ±  3%      +0.4        2.06 ±  2%  perf-profile.children.cycles-pp.set_pte_range
      1.50            +0.6        2.06        perf-profile.children.cycles-pp.filemap_get_entry
      3.42 ±  3%      +0.8        4.24        perf-profile.children.cycles-pp._raw_spin_lock_irq
      7.48            +0.9        8.41        perf-profile.children.cycles-pp.folio_wait_bit_common
      9.67 ±  4%      +1.4       11.07 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
     12.08 ±  3%      +1.8       13.84        perf-profile.children.cycles-pp.folio_wake_bit
     10.15            +1.9       12.07        perf-profile.children.cycles-pp.shmem_get_folio_gfp
     11.80 ±  4%      +1.9       13.74 ±  2%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     10.26            +2.0       12.25        perf-profile.children.cycles-pp.shmem_fault
     10.29            +2.0       12.29        perf-profile.children.cycles-pp.__do_fault
      8.59            +5.3       13.94 ±  2%  perf-profile.children.cycles-pp.do_rw_once
     35.10            -6.1       28.98 ±  2%  perf-profile.self.cycles-pp.next_uptodate_folio
      2.06 ±  7%      -1.9        0.11 ±  4%  perf-profile.self.cycles-pp.down_read_trylock
      1.28 ±  4%      -1.1        0.16 ±  3%  perf-profile.self.cycles-pp.up_read
      1.66 ±  6%      -1.0        0.68 ±  3%  perf-profile.self.cycles-pp.__handle_mm_fault
      7.20            -0.7        6.55        perf-profile.self.cycles-pp.filemap_map_pages
      0.64 ± 12%      -0.4        0.28 ± 15%  perf-profile.self.cycles-pp.intel_idle_irq
      0.36 ±  7%      -0.2        0.15        perf-profile.self.cycles-pp.mtree_range_walk
      0.30 ±  8%      -0.2        0.13 ± 14%  perf-profile.self.cycles-pp.mm_cid_get
      0.71 ±  8%      -0.1        0.59 ±  7%  perf-profile.self.cycles-pp.__schedule
      0.05 ±  8%      +0.0        0.06 ±  7%  perf-profile.self.cycles-pp.ttwu_do_activate
      0.08 ±  5%      +0.0        0.10 ±  4%  perf-profile.self.cycles-pp.do_idle
      0.06 ±  6%      +0.0        0.08 ±  6%  perf-profile.self.cycles-pp.enqueue_task_fair
      0.05 ±  8%      +0.0        0.07 ±  8%  perf-profile.self.cycles-pp.__update_load_avg_se
      0.09 ±  5%      +0.0        0.10 ±  4%  perf-profile.self.cycles-pp.xas_descend
      0.04 ± 44%      +0.0        0.06        perf-profile.self.cycles-pp.reweight_entity
      0.05 ±  7%      +0.0        0.07 ±  9%  perf-profile.self.cycles-pp.set_pte_range
      0.08 ±  6%      +0.0        0.10 ±  5%  perf-profile.self.cycles-pp.update_load_avg
      0.10 ±  4%      +0.0        0.12 ±  3%  perf-profile.self.cycles-pp.call_function_single_prep_ipi
      0.07 ±  5%      +0.0        0.09 ±  5%  perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
      0.08 ±  6%      +0.0        0.10 ±  6%  perf-profile.self.cycles-pp.flush_smp_call_function_queue
      0.10 ±  4%      +0.0        0.13 ±  2%  perf-profile.self.cycles-pp.__flush_smp_call_function_queue
      0.16 ±  4%      +0.0        0.19 ±  3%  perf-profile.self.cycles-pp.__switch_to
      0.14 ±  3%      +0.0        0.16 ±  3%  perf-profile.self.cycles-pp.llist_add_batch
      0.09 ±  5%      +0.0        0.12 ±  6%  perf-profile.self.cycles-pp.available_idle_cpu
      0.08 ±  5%      +0.0        0.12 ±  6%  perf-profile.self.cycles-pp.enqueue_entity
      0.08 ±  5%      +0.0        0.12 ±  4%  perf-profile.self.cycles-pp.llist_reverse_order
      0.10 ±  4%      +0.0        0.13 ±  3%  perf-profile.self.cycles-pp.update_rq_clock_task
      0.08            +0.0        0.11        perf-profile.self.cycles-pp.__list_del_entry_valid_or_report
      0.11 ±  4%      +0.0        0.14 ±  4%  perf-profile.self.cycles-pp.__switch_to_asm
      0.09 ±  5%      +0.0        0.12 ±  8%  perf-profile.self.cycles-pp.ttwu_queue_wakelist
      0.12 ±  4%      +0.0        0.16 ±  6%  perf-profile.self.cycles-pp.xas_load
      0.00            +0.1        0.05        perf-profile.self.cycles-pp.sched_ttwu_pending
      0.00            +0.1        0.06        perf-profile.self.cycles-pp.asm_exc_page_fault
      0.11 ±  4%      +0.1        0.18 ±  4%  perf-profile.self.cycles-pp.shmem_fault
      0.17 ±  3%      +0.1        0.26 ±  2%  perf-profile.self.cycles-pp.sync_regs
      0.31 ±  2%      +0.1        0.40 ±  5%  perf-profile.self.cycles-pp.___perf_sw_event
      0.31 ±  2%      +0.1        0.40 ±  3%  perf-profile.self.cycles-pp.__wake_up_common
      0.24 ±  2%      +0.1        0.35        perf-profile.self.cycles-pp.native_irq_return_iret
      0.31            +0.1        0.43 ±  3%  perf-profile.self.cycles-pp.folio_unlock
      0.44 ±  3%      +0.1        0.57 ±  2%  perf-profile.self.cycles-pp._raw_spin_lock_irq
      0.68 ±  3%      +0.1        0.83 ±  2%  perf-profile.self.cycles-pp.folio_wake_bit
      0.85            +0.2        1.00 ±  3%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
      0.40 ±  5%      +0.2        0.56 ±  5%  perf-profile.self.cycles-pp._compound_head
      1.29            +0.3        1.59        perf-profile.self.cycles-pp.folio_add_file_rmap_range
      0.99            +0.3        1.30 ±  2%  perf-profile.self.cycles-pp.shmem_get_folio_gfp
      2.08            +0.3        2.39 ±  2%  perf-profile.self.cycles-pp.folio_wait_bit_common
      1.18            +0.4        1.55        perf-profile.self.cycles-pp.irqentry_exit_to_user_mode
      1.43            +0.5        1.90        perf-profile.self.cycles-pp.filemap_get_entry
      3.93            +1.9        5.85        perf-profile.self.cycles-pp.do_access
     11.80 ±  4%      +1.9       13.74 ±  2%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      6.55            +4.5       11.08 ±  2%  perf-profile.self.cycles-pp.do_rw_once




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
