[linus:master] [mm] 249608ee47: will-it-scale.per_thread_ops 50.1% improvement

Hello,

kernel test robot noticed a 50.1% improvement of will-it-scale.per_thread_ops on:


commit: 249608ee47132cab3b1adacd9e463548f57bd316 ("mm: respect mmap hint address when aligning for THP")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


testcase: will-it-scale
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:

	nr_task: 100%
	mode: thread
	test: brk1
	cpufreq_governor: performance


In addition to that, the commit also has a significant impact on the following tests:

+------------------+---------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops 51.6% improvement |
| test machine     | 104 threads 2 sockets (Skylake) with 192G memory              |
| test parameters  | cpufreq_governor=performance                                  |
|                  | mode=thread                                                   |
|                  | nr_task=100%                                                  |
|                  | test=brk2                                                     |
+------------------+---------------------------------------------------------------+




Details are below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241212/202412122346.ea54d461-lkp@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/thread/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/brk1/will-it-scale

commit: 
  89dd878282 ("mm: memcg: declare do_memsw_account inline")
  249608ee47 ("mm: respect mmap hint address when aligning for THP")

89dd878282881306 249608ee47132cab3b1adacd9e4 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 3.271e+09 ± 11%     -23.6%  2.499e+09 ±  4%  cpuidle..time
    534782 ±  3%      -9.8%     482625        meminfo.Shmem
      7292 ± 10%     -16.8%       6068        uptime.idle
    117230            +3.0%     120705        vmstat.system.in
     10.21 ± 10%      -2.5        7.74 ±  4%  mpstat.cpu.all.idle%
      0.10            -0.0        0.08        mpstat.cpu.all.soft%
      0.30 ±  8%      +0.1        0.38 ±  2%  mpstat.cpu.all.usr%
   1562083 ±  5%     -28.9%    1111214 ±  6%  numa-numastat.node0.local_node
   1600171 ±  5%     -27.1%    1165935 ±  5%  numa-numastat.node0.numa_hit
   2469533 ±  5%     -36.7%    1562269 ±  7%  numa-numastat.node1.local_node
   2538689 ±  5%     -36.4%    1615104 ±  7%  numa-numastat.node1.numa_hit
   1599764 ±  5%     -27.2%    1165290 ±  5%  numa-vmstat.node0.numa_hit
   1561676 ±  5%     -28.9%    1110570 ±  6%  numa-vmstat.node0.numa_local
   2537854 ±  5%     -36.4%    1613883 ±  7%  numa-vmstat.node1.numa_hit
   2468697 ±  5%     -36.8%    1561112 ±  7%  numa-vmstat.node1.numa_local
    517.00 ±  6%     +44.8%     748.67 ±  5%  perf-c2c.DRAM.local
      5599 ±  3%     +22.8%       6877 ±  3%  perf-c2c.DRAM.remote
      5356 ±  2%     +17.2%       6277 ±  4%  perf-c2c.HITM.local
      3995 ±  3%     +12.9%       4512 ±  2%  perf-c2c.HITM.remote
    207757 ±  3%     +50.1%     311758 ±  4%  will-it-scale.104.threads
      9.27 ±  4%     -19.6%       7.45 ±  4%  will-it-scale.104.threads_idle
      1997 ±  3%     +50.1%       2997 ±  4%  will-it-scale.per_thread_ops
    207757 ±  3%     +50.1%     311758 ±  4%  will-it-scale.workload
  20771245 ±  7%     +19.8%   24875862 ±  5%  sched_debug.cfs_rq:/.avg_vruntime.avg
   6013540 ±  9%     +29.6%    7795227 ± 15%  sched_debug.cfs_rq:/.avg_vruntime.stddev
  20771245 ±  7%     +19.8%   24875862 ±  5%  sched_debug.cfs_rq:/.min_vruntime.avg
   6013540 ±  9%     +29.6%    7795227 ± 15%  sched_debug.cfs_rq:/.min_vruntime.stddev
      5286 ±  5%     -32.3%       3580 ±  9%  sched_debug.cpu.avg_idle.min
    304791            -4.4%     291399        proc-vmstat.nr_active_anon
   1009858            -1.3%     996889        proc-vmstat.nr_file_pages
     23935            -4.3%      22912        proc-vmstat.nr_mapped
    133626 ±  3%      -9.7%     120653        proc-vmstat.nr_shmem
    108257            -1.7%     106463        proc-vmstat.nr_slab_unreclaimable
    304791            -4.4%     291399        proc-vmstat.nr_zone_active_anon
   4140560           -32.8%    2781620 ±  2%  proc-vmstat.numa_hit
   4033316           -33.7%    2674065 ±  2%  proc-vmstat.numa_local
   7314624 ±  2%     -37.7%    4554492 ±  3%  proc-vmstat.pgalloc_normal
   1102175            -2.4%    1075842        proc-vmstat.pgfault
   7136742 ±  2%     -38.5%    4391328 ±  3%  proc-vmstat.pgfree
      0.49 ±  6%     +23.1%       0.60 ±  6%  perf-stat.i.MPKI
     37.67            +4.2       41.92        perf-stat.i.cache-miss-rate%
  13495545 ±  3%     +26.4%   17064915 ±  6%  perf-stat.i.cache-misses
  36075782 ±  2%     +14.0%   41135363 ±  5%  perf-stat.i.cache-references
      9.29            +2.5%       9.52        perf-stat.i.cpi
 2.621e+11            +2.5%  2.685e+11        perf-stat.i.cpu-cycles
    212.81            -1.4%     209.80        perf-stat.i.cpu-migrations
     19736 ±  4%     -19.1%      15958 ±  7%  perf-stat.i.cycles-between-cache-misses
      0.11 ±  2%      -3.3%       0.11        perf-stat.i.ipc
      0.48 ±  4%     +25.9%       0.60 ±  6%  perf-stat.overall.MPKI
     37.35            +4.0       41.40        perf-stat.overall.cache-miss-rate%
      9.33            +2.0%       9.52        perf-stat.overall.cpi
     19440 ±  3%     -18.7%      15809 ±  7%  perf-stat.overall.cycles-between-cache-misses
      0.11            -2.0%       0.11        perf-stat.overall.ipc
  40994713 ±  3%     -33.4%   27301203 ±  4%  perf-stat.overall.path-length
  13453027 ±  3%     +26.4%   17009626 ±  6%  perf-stat.ps.cache-misses
  36008186 ±  2%     +14.0%   41056969 ±  5%  perf-stat.ps.cache-references
 2.612e+11            +2.5%  2.676e+11        perf-stat.ps.cpu-cycles
    212.16            -1.4%     209.13        perf-stat.ps.cpu-migrations
      0.00 ±143%    +614.3%       0.01 ± 38%  perf-sched.sch_delay.avg.ms.__cond_resched.__alloc_pages_noprof.alloc_pages_mpol_noprof.folio_alloc_mpol_noprof.vma_alloc_folio_noprof
      0.00 ±223%  +12311.1%       0.19 ±115%  perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
      0.00         +2575.0%       0.05 ± 92%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.vms_gather_munmap_vmas
      0.04 ±175%    +275.8%       0.15 ± 89%  perf-sched.sch_delay.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.02 ±120%    +669.0%       0.15 ± 89%  perf-sched.sch_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
      0.01 ± 32%    +657.1%       0.07 ± 51%  perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.15 ±114%    +559.8%       1.00 ± 19%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.00 ± 55%    +229.2%       0.01 ± 22%  perf-sched.sch_delay.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
      0.04 ± 61%    +378.2%       0.19 ± 15%  perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
      0.01 ± 15%    +160.3%       0.03 ±109%  perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      0.01 ± 30%    +216.1%       0.02 ± 12%  perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__do_sys_brk
      0.03 ±163%    +448.7%       0.18 ± 24%  perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      0.01 ± 30%     +96.7%       0.02 ± 11%  perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      0.01 ± 86%    +234.6%       0.05 ± 60%  perf-sched.sch_delay.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.00 ±143%    +700.0%       0.01 ± 33%  perf-sched.sch_delay.max.ms.__cond_resched.__alloc_pages_noprof.alloc_pages_mpol_noprof.folio_alloc_mpol_noprof.vma_alloc_folio_noprof
      0.00 ±223%  +50788.9%       0.76 ±137%  perf-sched.sch_delay.max.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
      1.05 ±141%    +326.0%       4.46 ± 67%  perf-sched.sch_delay.max.ms.__cond_resched.down_write.vma_expand.vma_merge_new_range.do_brk_flags
      0.60 ±186%    +271.1%       2.25 ± 74%  perf-sched.sch_delay.max.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
      0.02 ± 97%  +14710.9%       2.72 ± 47%  perf-sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.vms_gather_munmap_vmas
      0.17 ±208%    +228.7%       0.54 ± 80%  perf-sched.sch_delay.max.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.10 ±150%   +2829.8%       2.93 ± 34%  perf-sched.sch_delay.max.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.73 ± 99%    +137.5%       4.10 ±  5%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.05 ±162%   +3038.5%       1.62 ± 72%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown]
      0.18 ±174%   +1759.9%       3.30 ± 41%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
      2.19 ± 69%     +74.8%       3.82 ±  6%  perf-sched.sch_delay.max.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
      1.16 ± 95%    +211.8%       3.61 ±  8%  perf-sched.sch_delay.max.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.01 ± 25%    +200.0%       0.02 ± 11%  perf-sched.total_sch_delay.average.ms
      5.20 ±  7%     +55.1%       8.06 ±  7%  perf-sched.total_wait_and_delay.average.ms
    338197 ±  7%     -43.5%     190977 ±  7%  perf-sched.total_wait_and_delay.count.ms
      5.19 ±  7%     +54.9%       8.04 ±  7%  perf-sched.total_wait_time.average.ms
      6.72 ±  6%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
     70.88 ±162%    +311.9%     292.00 ± 22%  perf-sched.wait_and_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.91 ± 15%     -43.6%       0.51 ±  3%  perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__do_sys_brk
    279.25 ± 11%     +24.7%     348.09 ±  5%  perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    607.00 ±  6%    -100.0%       0.00        perf-sched.wait_and_delay.count.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
    328796 ±  8%     -45.0%     180683 ±  7%  perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__do_sys_brk
      3211 ±  6%     -20.9%       2541 ±  7%  perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      1001          -100.0%       0.00        perf-sched.wait_and_delay.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      0.00 ±223%  +52555.6%       0.79 ± 31%  perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
      0.00 ±142%  +1.2e+05%       1.79 ± 90%  perf-sched.wait_time.avg.ms.__cond_resched.down_write.vma_prepare.commit_merge.vma_expand
     70.88 ±162%    +312.0%     291.99 ± 22%  perf-sched.wait_time.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.91 ± 16%     -45.1%       0.50 ±  3%  perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__do_sys_brk
      0.98 ± 11%     +43.4%       1.40 ± 25%  perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
    279.22 ± 11%     +24.7%     348.08 ±  5%  perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.00 ±223%  +1.5e+05%       2.21 ± 63%  perf-sched.wait_time.max.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
      0.00 ±145%  +2.2e+05%       3.74 ± 71%  perf-sched.wait_time.max.ms.__cond_resched.down_write.vma_prepare.commit_merge.vma_expand
      0.05 ±161%   +3018.3%       1.62 ± 72%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown]
      0.59 ±  3%      -0.3        0.27 ±100%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.osq_lock.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable
      0.57 ±  6%      -0.3        0.26 ±100%  perf-profile.calltrace.cycles-pp.intel_idle_ibrs.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      1.70 ±  4%      -0.2        1.49 ±  3%  perf-profile.calltrace.cycles-pp.common_startup_64
      1.61 ±  4%      -0.2        1.40 ±  3%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
      1.61 ±  4%      -0.2        1.40 ±  3%  perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
      1.62 ±  4%      -0.2        1.42 ±  3%  perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.common_startup_64
      1.68 ±  4%      -0.2        1.47 ±  3%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.common_startup_64
      1.68 ±  4%      -0.2        1.48 ±  3%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.common_startup_64
      1.68 ±  4%      -0.2        1.48 ±  3%  perf-profile.calltrace.cycles-pp.start_secondary.common_startup_64
      0.72            -0.1        0.58 ±  2%  perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.__do_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
      0.81            -0.1        0.70        perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable.__do_sys_brk
     97.96            +0.1       98.08        perf-profile.calltrace.cycles-pp.__do_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
     97.98            +0.1       98.11        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
     96.80            +0.1       96.94        perf-profile.calltrace.cycles-pp.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable.__do_sys_brk.do_syscall_64
     98.01            +0.1       98.16        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.brk
     96.91            +0.2       97.07        perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.down_write_killable.__do_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
     96.94            +0.2       97.12        perf-profile.calltrace.cycles-pp.down_write_killable.__do_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
     95.81            +0.2       96.00        perf-profile.calltrace.cycles-pp.osq_lock.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable.__do_sys_brk
     98.17            +0.2       98.40        perf-profile.calltrace.cycles-pp.brk
      0.00            +0.6        0.59 ±  2%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      0.53 ±  6%      -0.4        0.17 ±  8%  perf-profile.children.cycles-pp.intel_idle_irq
      1.00 ±  4%      -0.3        0.70 ±  3%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      1.70 ±  4%      -0.2        1.49 ±  3%  perf-profile.children.cycles-pp.common_startup_64
      1.70 ±  4%      -0.2        1.49 ±  3%  perf-profile.children.cycles-pp.cpu_startup_entry
      1.63 ±  4%      -0.2        1.42 ±  3%  perf-profile.children.cycles-pp.cpuidle_enter
      1.63 ±  4%      -0.2        1.42 ±  3%  perf-profile.children.cycles-pp.cpuidle_enter_state
      1.64 ±  4%      -0.2        1.43 ±  3%  perf-profile.children.cycles-pp.cpuidle_idle_call
      1.70 ±  4%      -0.2        1.49 ±  3%  perf-profile.children.cycles-pp.do_idle
      1.68 ±  4%      -0.2        1.48 ±  3%  perf-profile.children.cycles-pp.start_secondary
      0.21 ±  2%      -0.2        0.05        perf-profile.children.cycles-pp.mas_store_gfp
      0.72            -0.1        0.58 ±  2%  perf-profile.children.cycles-pp.do_vmi_align_munmap
      0.82            -0.1        0.70        perf-profile.children.cycles-pp.rwsem_spin_on_owner
      0.17 ±  2%      -0.1        0.06 ±  7%  perf-profile.children.cycles-pp.mas_store_prealloc
      0.17 ±  2%      -0.1        0.07 ±  5%  perf-profile.children.cycles-pp.vma_complete
      0.58 ±  6%      -0.1        0.49 ±  9%  perf-profile.children.cycles-pp.intel_idle_ibrs
      0.64 ±  3%      -0.1        0.56 ±  3%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      0.54 ±  3%      -0.1        0.47 ±  4%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      0.54 ±  4%      -0.1        0.47 ±  4%  perf-profile.children.cycles-pp.hrtimer_interrupt
      0.45 ±  3%      -0.1        0.39 ±  4%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.41 ±  4%      -0.1        0.36 ±  5%  perf-profile.children.cycles-pp.tick_nohz_handler
      0.35            -0.0        0.31 ±  3%  perf-profile.children.cycles-pp.vms_gather_munmap_vmas
      0.32            -0.0        0.27 ±  3%  perf-profile.children.cycles-pp.__split_vma
      0.36 ±  2%      -0.0        0.31 ±  5%  perf-profile.children.cycles-pp.update_process_times
      0.14 ±  6%      -0.0        0.12 ±  4%  perf-profile.children.cycles-pp.handle_softirqs
      0.23 ±  2%      -0.0        0.20 ±  4%  perf-profile.children.cycles-pp.sched_tick
      0.13 ±  6%      -0.0        0.10 ±  4%  perf-profile.children.cycles-pp.rcu_core
      0.13 ±  5%      -0.0        0.10 ±  4%  perf-profile.children.cycles-pp.rcu_do_batch
      0.15 ±  3%      -0.0        0.12 ±  3%  perf-profile.children.cycles-pp.kmem_cache_free
      0.06 ±  6%      -0.0        0.04 ± 44%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
      0.06 ± 11%      -0.0        0.05        perf-profile.children.cycles-pp.kthread
      0.06 ± 11%      -0.0        0.05        perf-profile.children.cycles-pp.ret_from_fork
      0.06 ± 11%      -0.0        0.05        perf-profile.children.cycles-pp.ret_from_fork_asm
      0.06 ±  7%      -0.0        0.05        perf-profile.children.cycles-pp.smpboot_thread_fn
      0.06            -0.0        0.05        perf-profile.children.cycles-pp.__slab_free
      0.06 ±  7%      +0.0        0.08 ±  5%  perf-profile.children.cycles-pp.vma_expand
      0.07 ±  7%      +0.0        0.09 ±  4%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.08 ±  6%      +0.0        0.10        perf-profile.children.cycles-pp.vma_merge_new_range
      0.06 ±  9%      +0.0        0.08 ±  4%  perf-profile.children.cycles-pp.anon_vma_clone
      0.08 ±  5%      +0.0        0.11 ±  6%  perf-profile.children.cycles-pp.up_write
      0.06 ±  8%      +0.0        0.09 ±  8%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.05 ±  7%      +0.0        0.09 ±  7%  perf-profile.children.cycles-pp.kmem_cache_alloc_noprof
      0.08 ±  5%      +0.0        0.12 ±  3%  perf-profile.children.cycles-pp.vms_clear_ptes
      0.12 ±  4%      +0.0        0.16 ±  2%  perf-profile.children.cycles-pp.do_brk_flags
      0.00            +0.1        0.05        perf-profile.children.cycles-pp.unlink_anon_vmas
      0.00            +0.1        0.06 ±  9%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      0.00            +0.1        0.06 ±  8%  perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook
      0.00            +0.1        0.06 ±  8%  perf-profile.children.cycles-pp.vm_area_dup
      0.00            +0.1        0.06        perf-profile.children.cycles-pp.free_pgtables
      0.16 ±  4%      +0.1        0.22 ±  3%  perf-profile.children.cycles-pp.vms_complete_munmap_vmas
      0.00            +0.1        0.07 ±  5%  perf-profile.children.cycles-pp.mas_wr_node_store
      0.00            +0.1        0.11 ±  4%  perf-profile.children.cycles-pp.poll_idle
     97.96            +0.1       98.08        perf-profile.children.cycles-pp.__do_sys_brk
     98.02            +0.1       98.14        perf-profile.children.cycles-pp.do_syscall_64
     96.80            +0.1       96.94        perf-profile.children.cycles-pp.rwsem_optimistic_spin
     98.05            +0.1       98.19        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.45 ±  4%      +0.2        0.60 ±  2%  perf-profile.children.cycles-pp.intel_idle
     96.91            +0.2       97.07        perf-profile.children.cycles-pp.rwsem_down_write_slowpath
     96.94            +0.2       97.12        perf-profile.children.cycles-pp.down_write_killable
     95.84            +0.2       96.02        perf-profile.children.cycles-pp.osq_lock
     98.18            +0.2       98.40        perf-profile.children.cycles-pp.brk
      0.50 ±  6%      -0.3        0.16 ±  9%  perf-profile.self.cycles-pp.intel_idle_irq
      0.81            -0.1        0.70        perf-profile.self.cycles-pp.rwsem_spin_on_owner
      0.58 ±  6%      -0.1        0.49 ±  9%  perf-profile.self.cycles-pp.intel_idle_ibrs
      0.06 ±  8%      +0.0        0.08 ±  5%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.06 ±  7%      +0.0        0.09 ±  4%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.00            +0.1        0.05        perf-profile.self.cycles-pp.entry_SYSCALL_64
      0.00            +0.1        0.05 ±  7%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.13 ±  2%      +0.1        0.18 ±  2%  perf-profile.self.cycles-pp.rwsem_optimistic_spin
      0.00            +0.1        0.06 ±  6%  perf-profile.self.cycles-pp.up_write
      0.00            +0.1        0.11 ±  4%  perf-profile.self.cycles-pp.poll_idle
      0.45 ±  4%      +0.2        0.60 ±  2%  perf-profile.self.cycles-pp.intel_idle
     95.28            +0.3       95.53        perf-profile.self.cycles-pp.osq_lock
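
For readers checking the headline figures, the %change column for the will-it-scale rows is consistent with a plain relative change of the patched value against the base value. The snippet below is only a sketch under that assumption: the helper name pct_change is hypothetical and not part of the lkp tooling, and the inputs are the will-it-scale.workload means from the brk1 and brk2 tables in this report.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent.

    Assumption: this matches how the %change column is derived for
    rows that are not themselves percentages.
    """
    if base == 0:
        raise ValueError("base must be non-zero")
    return (new - base) / base * 100.0


# will-it-scale.workload means: parent commit 89dd878282 vs. 249608ee47
brk1 = pct_change(207757, 311758)
brk2 = pct_change(198860, 301493)

print(f"brk1: {brk1:+.1f}%")  # brk1: +50.1%
print(f"brk2: {brk2:+.1f}%")  # brk2: +51.6%
```

Rows whose metric is already a percentage (for example mpstat.cpu.all.idle% or the perf-profile entries) appear to show an absolute difference in that column instead, so this helper should not be applied to them.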


***************************************************************************************************
lkp-skl-fpga01: 104 threads 2 sockets (Skylake) with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-9.4/thread/100%/debian-12-x86_64-20240206.cgz/lkp-skl-fpga01/brk2/will-it-scale

commit: 
  89dd878282 ("mm: memcg: declare do_memsw_account inline")
  249608ee47 ("mm: respect mmap hint address when aligning for THP")

89dd878282881306 249608ee47132cab3b1adacd9e4 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 3.415e+09 ±  5%     -18.3%  2.791e+09 ±  8%  cpuidle..time
    117810            +2.1%     120255        vmstat.system.in
     10.66 ±  4%      -2.0        8.69 ±  8%  mpstat.cpu.all.idle%
      0.10            -0.0        0.08 ±  2%  mpstat.cpu.all.soft%
      0.31            +0.1        0.37 ±  2%  mpstat.cpu.all.usr%
   1679216 ±  5%     -30.5%    1166751 ±  9%  numa-numastat.node0.local_node
   1728543 ±  4%     -29.7%    1214908 ±  8%  numa-numastat.node0.numa_hit
   2318360 ±  3%     -30.9%    1600917 ±  6%  numa-numastat.node1.local_node
   2376686 ±  2%     -30.1%    1660471 ±  5%  numa-numastat.node1.numa_hit
   1726631 ±  4%     -29.7%    1214257 ±  8%  numa-vmstat.node0.numa_hit
   1677304 ±  5%     -30.5%    1166100 ±  9%  numa-vmstat.node0.numa_local
   2374815 ±  2%     -30.1%    1659314 ±  5%  numa-vmstat.node1.numa_hit
   2316489 ±  3%     -30.9%    1599760 ±  6%  numa-vmstat.node1.numa_local
    198860           +51.6%     301493 ±  2%  will-it-scale.104.threads
     10.10           -22.5%       7.82 ±  2%  will-it-scale.104.threads_idle
      1911           +51.6%       2898 ±  2%  will-it-scale.per_thread_ops
    198860           +51.6%     301493 ±  2%  will-it-scale.workload
    506.67 ±  6%     +50.9%     764.67 ±  3%  perf-c2c.DRAM.local
      5447           +27.1%       6925 ±  3%  perf-c2c.DRAM.remote
      5367 ±  2%     +18.6%       6364        perf-c2c.HITM.local
      3830           +17.8%       4513 ±  3%  perf-c2c.HITM.remote
      9197           +18.3%      10877 ±  2%  perf-c2c.HITM.total
     23736            -1.8%      23303        proc-vmstat.nr_mapped
    108712            -2.0%     106548        proc-vmstat.nr_slab_unreclaimable
   4105528           -30.0%    2875907        proc-vmstat.numa_hit
   3997875           -30.8%    2768196        proc-vmstat.numa_local
    236448 ± 14%     -25.0%     177254 ± 12%  proc-vmstat.numa_pte_updates
   7242851           -34.3%    4757136        proc-vmstat.pgalloc_normal
   7071106           -35.1%    4589946        proc-vmstat.pgfree
  19917807 ±  2%     +24.3%   24752419 ±  3%  sched_debug.cfs_rq:/.avg_vruntime.avg
  38832674 ±  6%     +31.8%   51167079 ±  8%  sched_debug.cfs_rq:/.avg_vruntime.max
   5538759 ±  3%     +56.3%    8659607 ± 16%  sched_debug.cfs_rq:/.avg_vruntime.stddev
  19917807 ±  2%     +24.3%   24752418 ±  3%  sched_debug.cfs_rq:/.min_vruntime.avg
  38832674 ±  6%     +31.8%   51167093 ±  8%  sched_debug.cfs_rq:/.min_vruntime.max
   5538759 ±  3%     +56.3%    8659606 ± 16%  sched_debug.cfs_rq:/.min_vruntime.stddev
    894.81 ±  7%     +11.9%       1001 ±  8%  sched_debug.cfs_rq:/.util_est.max
      5560 ±  6%     -40.7%       3294 ±  3%  sched_debug.cpu.avg_idle.min
      0.52 ±  3%     +21.7%       0.63 ±  3%  perf-stat.i.MPKI
  17623556            -6.6%   16458641 ±  3%  perf-stat.i.branch-misses
     37.96            +3.6       41.59        perf-stat.i.cache-miss-rate%
  14340737 ±  3%     +22.2%   17528616 ±  2%  perf-stat.i.cache-misses
  38069590 ±  2%     +11.5%   42445235 ±  2%  perf-stat.i.cache-references
      9.24            +2.6%       9.48        perf-stat.i.cpi
 2.602e+11            +2.4%  2.665e+11        perf-stat.i.cpu-cycles
     18443 ±  3%     -17.1%      15286 ±  2%  perf-stat.i.cycles-between-cache-misses
      0.51 ±  2%     +22.2%       0.63 ±  2%  perf-stat.overall.MPKI
      0.32            -0.0        0.29 ±  2%  perf-stat.overall.branch-miss-rate%
     37.63            +3.6       41.25        perf-stat.overall.cache-miss-rate%
      9.28            +2.4%       9.50        perf-stat.overall.cpi
     18154 ±  2%     -16.2%      15205 ±  2%  perf-stat.overall.cycles-between-cache-misses
      0.11            -2.3%       0.11        perf-stat.overall.ipc
  42574383           -33.8%   28187632 ±  2%  perf-stat.overall.path-length
  17580646            -6.7%   16398374 ±  3%  perf-stat.ps.branch-misses
  14294844 ±  3%     +22.2%   17469729 ±  2%  perf-stat.ps.cache-misses
  37981661 ±  2%     +11.5%   42347645 ±  2%  perf-stat.ps.cache-references
 2.593e+11            +2.4%  2.655e+11        perf-stat.ps.cpu-cycles
      0.00 ±147%    +500.0%       0.01 ± 14%  perf-sched.sch_delay.avg.ms.__cond_resched.__alloc_pages_noprof.alloc_pages_mpol_noprof.folio_alloc_mpol_noprof.vma_alloc_folio_noprof
      0.11 ±  8%     -32.5%       0.08 ± 23%  perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      0.00 ±223%  +10641.7%       0.21 ± 55%  perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
      0.00 ±179%   +2890.9%       0.05 ± 53%  perf-sched.sch_delay.avg.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
      0.01 ±135%    +390.2%       0.07 ±100%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.vm_area_dup.__split_vma.vms_gather_munmap_vmas
      0.00 ±223%   +1475.0%       0.01 ± 71%  perf-sched.sch_delay.avg.ms.__cond_resched.mutex_lock.perf_poll.do_poll.constprop
      0.00 ±223%   +9837.5%       0.13 ±121%  perf-sched.sch_delay.avg.ms.__cond_resched.shmem_inode_acct_blocks.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_write_begin
      0.00 ± 14%   +1830.0%       0.06 ± 97%  perf-sched.sch_delay.avg.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part
      0.01 ±  8%   +2452.0%       0.21 ± 64%  perf-sched.sch_delay.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.01 ± 16%    +870.6%       0.08 ± 84%  perf-sched.sch_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
      0.01 ±  6%    +823.9%       0.07 ± 31%  perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00 ±100%    +411.1%       0.01 ±  9%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown].[unknown]
      0.02 ± 34%   +3178.5%       0.71 ± 32%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.01 ± 75%   +1602.7%       0.10 ±143%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      0.12 ±150%     -87.6%       0.02 ± 45%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
      0.00 ±150%   +1047.1%       0.03 ±105%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
      0.00 ± 30%    +346.7%       0.01 ± 20%  perf-sched.sch_delay.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
      0.02 ± 68%   +1050.0%       0.19 ± 27%  perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
      0.01 ± 14%    +376.8%       0.04 ±105%  perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      0.01 ±  9%    +138.9%       0.01 ± 12%  perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__do_sys_brk
      0.01         +2033.3%       0.13 ± 33%  perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      0.01 ± 11%    +216.7%       0.03 ± 83%  perf-sched.sch_delay.avg.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
      0.01 ±  5%    +172.1%       0.02 ± 11%  perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      0.01 ± 61%    +173.4%       0.03 ± 46%  perf-sched.sch_delay.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.00 ±147%    +787.5%       0.01 ± 37%  perf-sched.sch_delay.max.ms.__cond_resched.__alloc_pages_noprof.alloc_pages_mpol_noprof.folio_alloc_mpol_noprof.vma_alloc_folio_noprof
      0.03 ±223%   +4840.4%       1.24 ± 64%  perf-sched.sch_delay.max.ms.__cond_resched.__do_fault.do_read_fault.do_pte_missing.__handle_mm_fault
      0.00 ±223%  +41625.0%       0.83 ± 60%  perf-sched.sch_delay.max.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
      0.16 ±213%    +813.2%       1.48 ± 78%  perf-sched.sch_delay.max.ms.__cond_resched.down_write.vma_expand.vma_merge_new_range.do_brk_flags
      0.00 ±167%  +43144.0%       1.80 ± 59%  perf-sched.sch_delay.max.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
      0.00 ±223%  +22188.9%       0.33 ±216%  perf-sched.sch_delay.max.ms.__cond_resched.mutex_lock.fdget_pos.ksys_write.do_syscall_64
      0.00 ±223%   +2458.3%       0.05 ±154%  perf-sched.sch_delay.max.ms.__cond_resched.mutex_lock.perf_poll.do_poll.constprop
      0.00 ±223%  +68268.8%       1.82 ± 71%  perf-sched.sch_delay.max.ms.__cond_resched.shmem_inode_acct_blocks.shmem_alloc_and_add_folio.shmem_get_folio_gfp.shmem_write_begin
      0.00 ± 11%  +15918.5%       0.72 ±101%  perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part
      0.01 ± 12%   +5779.5%       0.72 ± 50%  perf-sched.sch_delay.max.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.02 ± 53%   +2545.4%       0.48 ± 73%  perf-sched.sch_delay.max.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
      0.02 ± 18%  +15675.3%       2.45 ± 11%  perf-sched.sch_delay.max.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00 ±100%   +1100.0%       0.02 ± 76%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown].[unknown]
      0.22 ± 70%   +1725.7%       3.94 ±  4%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.01 ± 72%   +3737.3%       0.33 ±114%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      0.00 ±141%  +25095.7%       0.97 ±144%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
      0.58 ± 79%    +423.4%       3.03 ± 43%  perf-sched.sch_delay.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
      0.91 ± 75%    +324.0%       3.84 ±  3%  perf-sched.sch_delay.max.ms.schedule_hrtimeout_range.do_poll.constprop.0.do_sys_poll
      0.02 ± 49%  +18885.6%       3.51 ± 21%  perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      0.06 ±  5%   +3199.2%       2.01        perf-sched.sch_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      0.93 ±115%    +238.9%       3.16 ± 52%  perf-sched.sch_delay.max.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      5.53 ±  3%     +35.2%       7.48 ±  3%  perf-sched.total_wait_and_delay.average.ms
    330090           -37.0%     207837 ±  4%  perf-sched.total_wait_and_delay.count.ms
      5.52 ±  3%     +35.2%       7.46 ±  3%  perf-sched.total_wait_time.average.ms
      6.70 ±  4%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
    167.82 ± 96%     -92.4%      12.75 ± 78%  perf-sched.wait_and_delay.avg.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
      1.20 ±  4%     -58.9%       0.49 ±  4%  perf-sched.wait_and_delay.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__do_sys_brk
    280.09 ±  3%     +36.1%     381.15 ±  3%  perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    606.50 ±  6%    -100.0%       0.00        perf-sched.wait_and_delay.count.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
    320972           -38.3%     197924 ±  4%  perf-sched.wait_and_delay.count.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__do_sys_brk
      3118 ±  2%     -24.6%       2352 ±  2%  perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    693.67            -9.8%     626.00        perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      1000          -100.0%       0.00        perf-sched.wait_and_delay.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
    167.82 ± 96%     -91.5%      14.30 ± 56%  perf-sched.wait_time.avg.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
      0.55 ±223%    +762.9%       4.74 ±117%  perf-sched.wait_time.avg.ms.__cond_resched.mutex_lock.fdget_pos.ksys_write.do_syscall_64
      0.61 ±  3%     +24.0%       0.76 ±  8%  perf-sched.wait_time.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.26 ±221%   +3041.2%       8.22 ±129%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
      1.20 ±  4%     -59.9%       0.48 ±  4%  perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_write_slowpath.down_write_killable.__do_sys_brk
      0.91           +45.7%       1.32 ±  6%  perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
    280.07 ±  3%     +36.1%     381.13 ±  3%  perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.43 ±223%    +525.8%       2.69 ± 57%  perf-sched.wait_time.max.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
      3.29 ±223%   +1258.4%      44.70 ± 98%  perf-sched.wait_time.max.ms.__cond_resched.mutex_lock.fdget_pos.ksys_write.do_syscall_64
     29.75 ±  9%     +42.0%      42.24 ± 16%  perf-sched.wait_time.max.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.52 ±222%  +67466.8%     350.90 ±131%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown].[unknown]
      3.60 ±  5%    +106.8%       7.43 ± 11%  perf-sched.wait_time.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      5.04           +36.0%       6.86 ±  4%  perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      1.72 ±  3%      -0.2        1.47 ±  3%  perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
      1.73 ±  3%      -0.2        1.48 ±  3%  perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.common_startup_64
      1.72 ±  3%      -0.2        1.47 ±  3%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
      1.82 ±  3%      -0.2        1.57 ±  3%  perf-profile.calltrace.cycles-pp.common_startup_64
      1.80 ±  3%      -0.2        1.56 ±  3%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.common_startup_64
      1.80 ±  3%      -0.2        1.56 ±  3%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.common_startup_64
      1.80 ±  3%      -0.2        1.56 ±  3%  perf-profile.calltrace.cycles-pp.start_secondary.common_startup_64
      0.63 ±  3%      -0.2        0.43 ± 44%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.osq_lock.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable
      0.73            -0.1        0.59        perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.__do_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
      0.82            -0.1        0.71        perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable.__do_sys_brk
      0.63 ±  3%      -0.1        0.54 ±  4%  perf-profile.calltrace.cycles-pp.intel_idle_ibrs.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
     97.85            +0.2       98.02        perf-profile.calltrace.cycles-pp.__do_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
     97.87            +0.2       98.04        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
     96.68            +0.2       96.85        perf-profile.calltrace.cycles-pp.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable.__do_sys_brk.do_syscall_64
     97.90            +0.2       98.09        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.brk
     96.79            +0.2       96.99        perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.down_write_killable.__do_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
     96.82            +0.2       97.04        perf-profile.calltrace.cycles-pp.down_write_killable.__do_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
     95.68            +0.2       95.91        perf-profile.calltrace.cycles-pp.osq_lock.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable.__do_sys_brk
     98.06            +0.3       98.32        perf-profile.calltrace.cycles-pp.brk
      0.00            +0.6        0.60 ±  3%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      0.56 ±  4%      -0.4        0.16 ±  4%  perf-profile.children.cycles-pp.intel_idle_irq
      1.06 ±  3%      -0.4        0.70 ±  4%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      1.73 ±  3%      -0.2        1.49 ±  3%  perf-profile.children.cycles-pp.cpuidle_enter
      1.73 ±  3%      -0.2        1.49 ±  3%  perf-profile.children.cycles-pp.cpuidle_enter_state
      1.74 ±  3%      -0.2        1.50 ±  3%  perf-profile.children.cycles-pp.cpuidle_idle_call
      1.82 ±  3%      -0.2        1.57 ±  3%  perf-profile.children.cycles-pp.common_startup_64
      1.82 ±  3%      -0.2        1.57 ±  3%  perf-profile.children.cycles-pp.cpu_startup_entry
      1.82 ±  3%      -0.2        1.57 ±  3%  perf-profile.children.cycles-pp.do_idle
      1.80 ±  3%      -0.2        1.56 ±  3%  perf-profile.children.cycles-pp.start_secondary
      0.21            -0.2        0.05 ±  7%  perf-profile.children.cycles-pp.mas_store_gfp
      0.73            -0.1        0.59        perf-profile.children.cycles-pp.do_vmi_align_munmap
      0.69 ±  2%      -0.1        0.56 ±  4%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      0.58 ±  3%      -0.1        0.47 ±  4%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      0.83            -0.1        0.72        perf-profile.children.cycles-pp.rwsem_spin_on_owner
      0.17 ±  2%      -0.1        0.07 ±  7%  perf-profile.children.cycles-pp.mas_store_prealloc
      0.58 ±  3%      -0.1        0.47 ±  4%  perf-profile.children.cycles-pp.hrtimer_interrupt
      0.17 ±  2%      -0.1        0.07 ±  6%  perf-profile.children.cycles-pp.vma_complete
      0.49 ±  3%      -0.1        0.39 ±  4%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.63 ±  4%      -0.1        0.55 ±  4%  perf-profile.children.cycles-pp.intel_idle_ibrs
      0.44 ±  4%      -0.1        0.36 ±  4%  perf-profile.children.cycles-pp.tick_nohz_handler
      0.39 ±  3%      -0.1        0.32 ±  4%  perf-profile.children.cycles-pp.update_process_times
      0.32            -0.0        0.28        perf-profile.children.cycles-pp.__split_vma
      0.36            -0.0        0.31        perf-profile.children.cycles-pp.vms_gather_munmap_vmas
      0.24 ±  4%      -0.0        0.20 ±  3%  perf-profile.children.cycles-pp.sched_tick
      0.19 ±  7%      -0.0        0.16 ±  2%  perf-profile.children.cycles-pp.task_tick_fair
      0.06 ±  6%      -0.0        0.03 ± 70%  perf-profile.children.cycles-pp.smpboot_thread_fn
      0.12 ±  4%      -0.0        0.10 ±  6%  perf-profile.children.cycles-pp.rcu_do_batch
      0.13 ±  3%      -0.0        0.10 ±  3%  perf-profile.children.cycles-pp.rcu_core
      0.14 ±  2%      -0.0        0.12 ±  4%  perf-profile.children.cycles-pp.handle_softirqs
      0.08 ±  4%      -0.0        0.06 ± 11%  perf-profile.children.cycles-pp.get_jiffies_update
      0.08 ±  5%      -0.0        0.06 ± 11%  perf-profile.children.cycles-pp.tmigr_requires_handle_remote
      0.14 ±  2%      -0.0        0.12 ±  3%  perf-profile.children.cycles-pp.kmem_cache_free
      0.07 ±  7%      -0.0        0.05        perf-profile.children.cycles-pp.kthread
      0.07 ±  7%      -0.0        0.05        perf-profile.children.cycles-pp.ret_from_fork
      0.07 ±  7%      -0.0        0.05        perf-profile.children.cycles-pp.ret_from_fork_asm
      0.10 ±  7%      -0.0        0.08 ±  4%  perf-profile.children.cycles-pp.update_cfs_group
      0.06            -0.0        0.05        perf-profile.children.cycles-pp.__slab_free
      0.05            +0.0        0.07 ±  5%  perf-profile.children.cycles-pp.commit_merge
      0.06 ±  9%      +0.0        0.08 ±  4%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.06 ±  6%      +0.0        0.09 ±  4%  perf-profile.children.cycles-pp.vma_expand
      0.08 ±  4%      +0.0        0.11 ±  5%  perf-profile.children.cycles-pp.up_write
      0.06 ±  6%      +0.0        0.09 ±  4%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.05 ±  7%      +0.0        0.08 ±  5%  perf-profile.children.cycles-pp.anon_vma_clone
      0.07 ±  5%      +0.0        0.11 ±  4%  perf-profile.children.cycles-pp.vma_merge_new_range
      0.06 ±  9%      +0.0        0.09        perf-profile.children.cycles-pp.kmem_cache_alloc_noprof
      0.08 ±  5%      +0.0        0.12 ±  3%  perf-profile.children.cycles-pp.vms_clear_ptes
      0.00            +0.1        0.05        perf-profile.children.cycles-pp.unlink_anon_vmas
      0.00            +0.1        0.05 ±  7%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      0.11 ±  4%      +0.1        0.17 ±  2%  perf-profile.children.cycles-pp.do_brk_flags
      0.00            +0.1        0.06 ±  6%  perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook
      0.00            +0.1        0.06 ±  6%  perf-profile.children.cycles-pp.free_pgtables
      0.00            +0.1        0.06        perf-profile.children.cycles-pp.vm_area_dup
      0.17 ±  2%      +0.1        0.23 ±  2%  perf-profile.children.cycles-pp.vms_complete_munmap_vmas
      0.00            +0.1        0.07 ±  7%  perf-profile.children.cycles-pp.mas_wr_node_store
      0.00            +0.1        0.12 ±  3%  perf-profile.children.cycles-pp.poll_idle
      0.46 ±  4%      +0.1        0.60 ±  3%  perf-profile.children.cycles-pp.intel_idle
     97.85            +0.2       98.02        perf-profile.children.cycles-pp.__do_sys_brk
     97.90            +0.2       98.08        perf-profile.children.cycles-pp.do_syscall_64
     96.68            +0.2       96.86        perf-profile.children.cycles-pp.rwsem_optimistic_spin
     97.94            +0.2       98.12        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     96.79            +0.2       96.99        perf-profile.children.cycles-pp.rwsem_down_write_slowpath
     96.82            +0.2       97.04        perf-profile.children.cycles-pp.down_write_killable
     95.71            +0.2       95.94        perf-profile.children.cycles-pp.osq_lock
     98.06            +0.3       98.32        perf-profile.children.cycles-pp.brk
      0.54 ±  4%      -0.4        0.15 ±  3%  perf-profile.self.cycles-pp.intel_idle_irq
      0.82            -0.1        0.71        perf-profile.self.cycles-pp.rwsem_spin_on_owner
      0.63 ±  4%      -0.1        0.55 ±  4%  perf-profile.self.cycles-pp.intel_idle_ibrs
      0.08 ±  4%      -0.0        0.06 ± 11%  perf-profile.self.cycles-pp.get_jiffies_update
      0.10 ±  7%      -0.0        0.08 ±  4%  perf-profile.self.cycles-pp.update_cfs_group
      0.06            -0.0        0.05        perf-profile.self.cycles-pp.ktime_get_update_offsets_now
      0.06 ±  9%      +0.0        0.08 ±  4%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.06            +0.0        0.09 ±  6%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.00            +0.1        0.05        perf-profile.self.cycles-pp.entry_SYSCALL_64
      0.00            +0.1        0.05        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.13 ±  3%      +0.1        0.18 ±  2%  perf-profile.self.cycles-pp.rwsem_optimistic_spin
      0.00            +0.1        0.06 ±  6%  perf-profile.self.cycles-pp.up_write
      0.00            +0.1        0.12 ±  4%  perf-profile.self.cycles-pp.poll_idle
      0.46 ±  4%      +0.1        0.60 ±  3%  perf-profile.self.cycles-pp.intel_idle
     95.11            +0.3       95.44        perf-profile.self.cycles-pp.osq_lock





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki




