Hello,

kernel test robot noticed a 12.6% improvement of will-it-scale.per_process_ops on:

commit: c0a242394cb980bd00e1f61dc8aacb453d2bbe6a ("mm, page_alloc: scale the number of pages that are batch allocated")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: will-it-scale
test machine: 104 threads 2 sockets (Skylake) with 192G memory
parameters:

        nr_task: 50%
        mode: process
        test: page_fault2
        cpufreq_governor: performance

Details are as below:
-------------------------------------------------------------------------------------------------->

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231120/202311201629.b861c327-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/process/50%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/page_fault2/will-it-scale

commit:
  52166607ec ("mm: restrict the pcp batch scale factor to avoid too long latency")
  c0a242394c ("mm, page_alloc: scale the number of pages that are batch allocated")

52166607ecc98039 c0a242394cb980bd00e1f61dc8a
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
      4.90            +0.6        5.49        mpstat.cpu.all.usr%
      1367 ±  6%     +72.8%       2362 ±  4%  perf-c2c.HITM.local
   8592059           +12.6%    9677986        will-it-scale.52.processes
    165231           +12.6%     186114        will-it-scale.per_process_ops
   8592059           +12.6%    9677986        will-it-scale.workload
      2592 ± 19%    +587.0%      17809 ± 97%  numa-meminfo.node0.Active(anon)
   3494860 ±  2%     -22.6%    2703947        numa-meminfo.node0.AnonPages.max
   3538966 ±  2%     -24.9%    2657708 ±  7%  numa-meminfo.node1.AnonPages.max
      9310 ±  3%      +7.6%      10019 ±  5%  numa-meminfo.node1.KernelStack
 1.295e+09           +12.8%   1.46e+09        numa-numastat.node0.local_node
 1.294e+09           +12.8%   1.46e+09        numa-numastat.node0.numa_hit
  1.31e+09           +12.0%  1.467e+09        numa-numastat.node1.local_node
 1.309e+09           +12.0%  1.466e+09        numa-numastat.node1.numa_hit
    213394 ± 50%    +373.5%    1010435 ± 33%  sched_debug.cfs_rq:/.avg_vruntime.min
   1932637 ±  4%     -32.0%    1313931 ±  8%  sched_debug.cfs_rq:/.avg_vruntime.stddev
    213394 ± 50%    +373.5%    1010435 ± 33%  sched_debug.cfs_rq:/.min_vruntime.min
   1932637 ±  4%     -32.0%    1313931 ±  8%  sched_debug.cfs_rq:/.min_vruntime.stddev
      0.08           +12.5%       0.09        turbostat.IPC
     63.77           -45.2       18.60 ± 22%  turbostat.PKG_%
    353.10            +2.9%     363.42        turbostat.PkgWatt
     68.28           +11.4%      76.03        turbostat.RAMWatt
    833540            +5.6%     880188        proc-vmstat.nr_anon_pages
 2.603e+09           +12.4%  2.925e+09        proc-vmstat.numa_hit
 2.605e+09           +12.4%  2.927e+09        proc-vmstat.numa_local
 2.599e+09           +12.4%   2.92e+09        proc-vmstat.pgalloc_normal
 2.591e+09           +12.4%  2.911e+09        proc-vmstat.pgfault
 2.599e+09           +12.4%   2.92e+09        proc-vmstat.pgfree
    648.18 ± 19%    +586.7%       4450 ± 97%  numa-vmstat.node0.nr_active_anon
    648.18 ± 19%    +586.7%       4450 ± 97%  numa-vmstat.node0.nr_zone_active_anon
 1.294e+09           +12.8%   1.46e+09        numa-vmstat.node0.numa_hit
 1.295e+09           +12.8%   1.46e+09        numa-vmstat.node0.numa_local
      9310 ±  3%      +7.6%      10021 ±  5%  numa-vmstat.node1.nr_kernel_stack
 1.309e+09           +12.0%  1.466e+09        numa-vmstat.node1.numa_hit
  1.31e+09           +12.0%  1.467e+09        numa-vmstat.node1.numa_local
      0.01 ± 80%     -93.5%       0.00 ±223%  perf-sched.sch_delay.avg.ms.io_schedule.folio_wait_bit_common.filemap_fault.__do_fault
      0.01 ±  9%    -100.0%       0.00        perf-sched.sch_delay.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      0.04 ±  9%     -46.3%       0.02 ± 73%  perf-sched.sch_delay.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.03 ±107%     -97.0%       0.00 ±223%  perf-sched.sch_delay.max.ms.io_schedule.folio_wait_bit_common.filemap_fault.__do_fault
      0.02 ± 27%    -100.0%       0.00        perf-sched.sch_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      0.03 ±  7%     -15.0%       0.02 ± 10%  perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      0.94 ± 16%     -51.9%       0.45 ± 22%  perf-sched.sch_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     98.83 ± 11%     -40.5%      58.83 ± 11%  perf-sched.wait_and_delay.count.__cond_resched.shmem_get_folio_gfp.shmem_fault.__do_fault.do_cow_fault
    232.00 ± 10%     +48.4%     344.33 ±  4%  perf-sched.wait_and_delay.count.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_exc_page_fault
     39.50 ± 54%     -87.3%       5.03        perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      2.99 ± 15%    -100.0%       0.00        perf-sched.wait_time.avg.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      4.81 ±  7%    -100.0%       0.00        perf-sched.wait_time.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
     33.32 ± 69%     -85.0%       5.01        perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
     16.82            +1.6%      17.09        perf-stat.i.MPKI
   8.6e+09           +11.3%  9.573e+09        perf-stat.i.branch-instructions
  39148476            +5.6%   41324228        perf-stat.i.branch-misses
     81.02            -3.1       77.94        perf-stat.i.cache-miss-rate%
 7.134e+08           +13.5%  8.096e+08        perf-stat.i.cache-misses
 8.802e+08           +17.7%  1.036e+09        perf-stat.i.cache-references
      1813            +1.2%       1834        perf-stat.i.context-switches
      3.43            -9.9%       3.09        perf-stat.i.cpi
    204.33           -10.7%     182.42 ±  2%  perf-stat.i.cycles-between-cache-misses
  10135544           +11.8%   11330409 ±  2%  perf-stat.i.dTLB-load-misses
  1.06e+10           +11.5%  1.182e+10        perf-stat.i.dTLB-loads
  70683663           +12.6%   79603765        perf-stat.i.dTLB-store-misses
 6.001e+09           +12.8%  6.766e+09        perf-stat.i.dTLB-stores
   9753929           +12.9%   11015762        perf-stat.i.iTLB-load-misses
  4.24e+10           +11.5%  4.728e+10        perf-stat.i.instructions
      4377            -1.5%       4312        perf-stat.i.instructions-per-iTLB-miss
      0.29           +11.5%       0.33        perf-stat.i.ipc
      0.34 ± 23%     -48.0%       0.18 ± 11%  perf-stat.i.major-faults
      1343           +17.4%       1577        perf-stat.i.metric.K/sec
    253.10           +11.9%     283.16        perf-stat.i.metric.M/sec
   8585112           +12.0%    9619126        perf-stat.i.minor-faults
      0.32 ± 27%      +0.3        0.60 ± 53%  perf-stat.i.node-load-miss-rate%
    694018           +17.3%     813810 ±  3%  perf-stat.i.node-load-misses
 2.451e+08            +3.6%  2.539e+08 ±  2%  perf-stat.i.node-loads
    538019           +14.0%     613240        perf-stat.i.node-store-misses
  49463410           +25.2%   61905404        perf-stat.i.node-stores
   8585112           +12.0%    9619126        perf-stat.i.page-faults
     16.83            +1.7%      17.12        perf-stat.overall.MPKI
      0.46            -0.0        0.43        perf-stat.overall.branch-miss-rate%
     81.06            -2.9       78.18        perf-stat.overall.cache-miss-rate%
      3.42           -10.5%       3.07        perf-stat.overall.cpi
    203.46           -12.0%     179.07        perf-stat.overall.cycles-between-cache-misses
      4347            -1.3%       4291        perf-stat.overall.instructions-per-iTLB-miss
      0.29           +11.7%       0.33        perf-stat.overall.ipc
      0.28            +0.0        0.32        perf-stat.overall.node-load-miss-rate%
      1.08            -0.1        0.98 ±  2%  perf-stat.overall.node-store-miss-rate%
 8.572e+09           +11.3%  9.542e+09        perf-stat.ps.branch-instructions
  39013363            +5.6%   41189792        perf-stat.ps.branch-misses
 7.111e+08           +13.5%   8.07e+08        perf-stat.ps.cache-misses
 8.773e+08           +17.7%  1.032e+09        perf-stat.ps.cache-references
      1805            +1.2%       1826        perf-stat.ps.context-switches
  10101169           +11.8%   11293042 ±  2%  perf-stat.ps.dTLB-load-misses
 1.056e+10           +11.6%  1.179e+10        perf-stat.ps.dTLB-loads
  70446051           +12.6%   79343784        perf-stat.ps.dTLB-store-misses
 5.981e+09           +12.8%  6.744e+09        perf-stat.ps.dTLB-stores
   9719620           +13.0%   10983217        perf-stat.ps.iTLB-load-misses
 4.225e+10           +11.5%  4.713e+10        perf-stat.ps.instructions
      0.34 ± 22%     -48.1%       0.18 ± 11%  perf-stat.ps.major-faults
   8556237           +12.1%    9587784        perf-stat.ps.minor-faults
    691779           +17.3%     811254 ±  3%  perf-stat.ps.node-load-misses
 2.442e+08            +3.6%  2.531e+08 ±  2%  perf-stat.ps.node-loads
    536237           +14.0%     611234        perf-stat.ps.node-store-misses
  49302195           +25.2%   61706509        perf-stat.ps.node-stores
   8556237           +12.1%    9587784        perf-stat.ps.page-faults
 1.277e+13           +11.9%   1.43e+13        perf-stat.total.instructions
     23.92           -10.1       13.79        perf-profile.calltrace.cycles-pp.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap
     23.92           -10.1       13.80        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
     23.92           -10.1       13.80        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
     23.92           -10.1       13.80        perf-profile.calltrace.cycles-pp.__munmap
     23.92           -10.1       13.80        perf-profile.calltrace.cycles-pp.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
     23.92           -10.1       13.80        perf-profile.calltrace.cycles-pp.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
     23.92           -10.1       13.80        perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
     23.92           -10.1       13.80        perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
     19.93            -9.8       10.12        perf-profile.calltrace.cycles-pp.release_pages.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range.unmap_page_range
     20.07            -9.7       10.33        perf-profile.calltrace.cycles-pp.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
     10.10            -9.3        0.84 ±  6%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.free_pcppages_bulk.free_unref_page_list.release_pages
     10.10            -9.3        0.84 ±  6%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_batch_pages_flush
     21.64            -9.2       12.46        perf-profile.calltrace.cycles-pp.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region
     21.66            -9.2       12.48        perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap
     21.66            -9.2       12.48        perf-profile.calltrace.cycles-pp.zap_pmd_range.unmap_page_range.unmap_vmas.unmap_region.do_vmi_align_munmap
     21.66            -9.2       12.48        perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
      9.06            -7.1        1.99 ±  2%  perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_batch_pages_flush.zap_pte_range
      9.58            -7.0        2.54 ±  2%  perf-profile.calltrace.cycles-pp.free_unref_page_list.release_pages.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range
      6.28 ±  2%      -6.3        0.00        perf-profile.calltrace.cycles-pp.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages.__folio_alloc
      6.67            -3.5        3.16 ±  4%  perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages.__folio_alloc.vma_alloc_folio
      6.90            -3.5        3.40 ±  3%  perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.__folio_alloc.vma_alloc_folio.do_cow_fault
      7.28            -3.5        3.83 ±  3%  perf-profile.calltrace.cycles-pp.__alloc_pages.__folio_alloc.vma_alloc_folio.do_cow_fault.do_fault
      7.34            -3.4        3.90 ±  3%  perf-profile.calltrace.cycles-pp.__folio_alloc.vma_alloc_folio.do_cow_fault.do_fault.__handle_mm_fault
      7.81            -3.4        4.41 ±  3%  perf-profile.calltrace.cycles-pp.vma_alloc_folio.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
      9.46            -2.9        6.54 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush.zap_pte_range
      9.46            -2.9        6.54 ±  2%  perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush.zap_pte_range.zap_pmd_range
      9.44            -2.1        7.34 ±  2%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush
     13.41            -2.0       11.42        perf-profile.calltrace.cycles-pp.copy_page.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
      2.25            -1.0        1.28 ±  3%  perf-profile.calltrace.cycles-pp.release_pages.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region.do_vmi_align_munmap
      2.26            -1.0        1.30 ±  2%  perf-profile.calltrace.cycles-pp.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap
      2.26            -1.0        1.30 ±  2%  perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.do_vmi_align_munmap.do_vmi_munmap.__vm_munmap
      4.23            -0.8        3.47        perf-profile.calltrace.cycles-pp._raw_spin_lock.__pte_offset_map_lock.finish_fault.do_cow_fault.do_fault
      4.35            -0.7        3.60        perf-profile.calltrace.cycles-pp.__pte_offset_map_lock.finish_fault.do_cow_fault.do_fault.__handle_mm_fault
      1.05            +0.0        1.09        perf-profile.calltrace.cycles-pp.shmem_get_folio_gfp.shmem_fault.__do_fault.do_cow_fault.do_fault
      0.67            +0.1        0.74 ±  2%  perf-profile.calltrace.cycles-pp.lock_vma_under_rcu.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
      1.25            +0.1        1.33        perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.do_cow_fault.do_fault.__handle_mm_fault
      1.34            +0.1        1.44        perf-profile.calltrace.cycles-pp.__do_fault.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
      0.61 ±  7%      +0.1        0.72 ±  2%  perf-profile.calltrace.cycles-pp.__mod_lruvec_page_state.folio_add_new_anon_rmap.set_pte_range.finish_fault.do_cow_fault
      1.06            +0.1        1.18        perf-profile.calltrace.cycles-pp.sync_regs.asm_exc_page_fault.testcase
      0.75 ±  6%      +0.1        0.88 ±  2%  perf-profile.calltrace.cycles-pp.folio_add_new_anon_rmap.set_pte_range.finish_fault.do_cow_fault.do_fault
      0.82            +0.2        1.00 ±  2%  perf-profile.calltrace.cycles-pp.lru_add_fn.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
      2.72            +0.3        3.02        perf-profile.calltrace.cycles-pp.error_entry.testcase
      2.52 ±  2%      +0.3        2.83        perf-profile.calltrace.cycles-pp.irqentry_exit_to_user_mode.asm_exc_page_fault.testcase
      2.77            +0.3        3.10        perf-profile.calltrace.cycles-pp.__irqentry_text_end.testcase
      0.72 ±  2%      +0.3        1.06        perf-profile.calltrace.cycles-pp.__free_one_page.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_batch_pages_flush
      0.75            +0.4        1.12 ±  2%  perf-profile.calltrace.cycles-pp._compound_head.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
      0.17 ±141%      +0.4        0.58        perf-profile.calltrace.cycles-pp.page_remove_rmap.zap_pte_range.zap_pmd_range.unmap_page_range.unmap_vmas
      0.00            +0.5        0.54        perf-profile.calltrace.cycles-pp.__perf_sw_event.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
      0.00            +0.8        0.82 ±  3%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush.tlb_finish_mmu
      0.00            +0.8        0.82 ±  3%  perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.release_pages.tlb_batch_pages_flush.tlb_finish_mmu.unmap_region
      0.00            +0.8        0.85 ±  3%  perf-profile.calltrace.cycles-pp.__list_del_entry_valid_or_report.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist
      0.00            +1.6        1.61 ±  5%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.rmqueue_bulk.__rmqueue_pcplist.rmqueue
      0.00            +1.6        1.62 ±  6%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist
      0.00            +2.6        2.56 ±  4%  perf-profile.calltrace.cycles-pp.rmqueue_bulk.__rmqueue_pcplist.rmqueue.get_page_from_freelist.__alloc_pages
      0.00            +2.9        2.90 ±  4%  perf-profile.calltrace.cycles-pp.__rmqueue_pcplist.rmqueue.get_page_from_freelist.__alloc_pages.__folio_alloc
     32.25            +7.6       39.90        perf-profile.calltrace.cycles-pp.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
     32.34            +7.7       39.99        perf-profile.calltrace.cycles-pp.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
     32.92            +7.7       40.65        perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
     33.76            +7.8       41.56        perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
     35.34            +8.0       43.29        perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.testcase
     35.47            +8.0       43.44        perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.testcase
     44.87            +9.1       53.97        perf-profile.calltrace.cycles-pp.asm_exc_page_fault.testcase
     48.05            +9.4       57.46        perf-profile.calltrace.cycles-pp.testcase
      8.27 ±  2%     +12.8       21.10 ±  2%  perf-profile.calltrace.cycles-pp.finish_fault.do_cow_fault.do_fault.__handle_mm_fault.handle_mm_fault
      1.35 ±  8%     +13.0       14.35 ±  3%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma
      1.43 ±  7%     +13.0       14.43 ±  3%  perf-profile.calltrace.cycles-pp.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault
      1.42 ±  7%     +13.0       14.42 ±  3%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.folio_lruvec_lock_irqsave.folio_batch_move_lru.folio_add_lru_vma.set_pte_range
      2.77 ±  5%     +13.4       16.18 ±  3%  perf-profile.calltrace.cycles-pp.folio_batch_move_lru.folio_add_lru_vma.set_pte_range.finish_fault.do_cow_fault
      2.90 ±  5%     +13.4       16.32 ±  3%  perf-profile.calltrace.cycles-pp.folio_add_lru_vma.set_pte_range.finish_fault.do_cow_fault.do_fault
      3.85 ±  4%     +13.6       17.42 ±  3%  perf-profile.calltrace.cycles-pp.set_pte_range.finish_fault.do_cow_fault.do_fault.__handle_mm_fault
     22.33           -10.8       11.57        perf-profile.children.cycles-pp.release_pages
     22.33           -10.7       11.63        perf-profile.children.cycles-pp.tlb_batch_pages_flush
     23.93           -10.1       13.80        perf-profile.children.cycles-pp.do_vmi_align_munmap
     23.92           -10.1       13.80        perf-profile.children.cycles-pp.__munmap
     23.92           -10.1       13.80        perf-profile.children.cycles-pp.__vm_munmap
     23.92           -10.1       13.80        perf-profile.children.cycles-pp.__x64_sys_munmap
     23.92           -10.1       13.79        perf-profile.children.cycles-pp.unmap_region
     23.93           -10.1       13.80        perf-profile.children.cycles-pp.do_vmi_munmap
     24.00           -10.1       13.87        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     23.99           -10.1       13.87        perf-profile.children.cycles-pp.do_syscall_64
     21.66            -9.2       12.48        perf-profile.children.cycles-pp.unmap_vmas
     21.66            -9.2       12.48        perf-profile.children.cycles-pp.unmap_page_range
     21.66            -9.2       12.48        perf-profile.children.cycles-pp.zap_pmd_range
     21.66            -9.2       12.48        perf-profile.children.cycles-pp.zap_pte_range
     11.00            -8.8        2.24 ±  2%  perf-profile.children.cycles-pp.free_pcppages_bulk
     11.59            -8.7        2.88 ±  2%  perf-profile.children.cycles-pp.free_unref_page_list
      6.30 ±  2%      -3.7        2.58 ±  4%  perf-profile.children.cycles-pp.rmqueue_bulk
      6.70            -3.5        3.19 ±  4%  perf-profile.children.cycles-pp.rmqueue
      6.94            -3.5        3.44 ±  3%  perf-profile.children.cycles-pp.get_page_from_freelist
      7.37            -3.4        3.92 ±  3%  perf-profile.children.cycles-pp.__folio_alloc
      7.36            -3.4        3.92 ±  3%  perf-profile.children.cycles-pp.__alloc_pages
      7.85            -3.4        4.45 ±  2%  perf-profile.children.cycles-pp.vma_alloc_folio
     13.43            -2.0       11.44        perf-profile.children.cycles-pp.copy_page
     25.59            -1.2       24.37 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
     25.50            -1.2       24.28 ±  2%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
      2.26            -1.0        1.30 ±  2%  perf-profile.children.cycles-pp.tlb_finish_mmu
      4.26            -0.8        3.50        perf-profile.children.cycles-pp._raw_spin_lock
      4.35            -0.7        3.61        perf-profile.children.cycles-pp.__pte_offset_map_lock
      1.92            -0.5        1.43 ±  3%  perf-profile.children.cycles-pp.__list_del_entry_valid_or_report
      0.09            +0.0        0.10 ±  3%  perf-profile.children.cycles-pp.put_page
      0.08 ±  5%      +0.0        0.10 ±  5%  perf-profile.children.cycles-pp.pte_offset_map_nolock
      0.10 ±  6%      +0.0        0.12 ±  4%  perf-profile.children.cycles-pp.scheduler_tick
      0.09 ±  4%      +0.0        0.10 ±  3%  perf-profile.children.cycles-pp.get_pfnblock_flags_mask
      0.12 ±  3%      +0.0        0.14 ±  3%  perf-profile.children.cycles-pp._raw_spin_trylock
      0.11 ±  4%      +0.0        0.13 ±  6%  perf-profile.children.cycles-pp.down_read_trylock
      0.16 ±  2%      +0.0        0.18 ±  2%  perf-profile.children.cycles-pp.free_unref_page_prepare
      0.14 ±  4%      +0.0        0.16 ±  3%  perf-profile.children.cycles-pp.tick_sched_timer
      0.13 ±  5%      +0.0        0.15 ±  3%  perf-profile.children.cycles-pp.cgroup_rstat_updated
      0.13 ±  5%      +0.0        0.15 ±  5%  perf-profile.children.cycles-pp.update_process_times
      0.13 ±  3%      +0.0        0.15 ±  3%  perf-profile.children.cycles-pp.__mem_cgroup_uncharge_list
      0.13 ±  5%      +0.0        0.15 ±  4%  perf-profile.children.cycles-pp.tick_sched_handle
      0.18 ±  3%      +0.0        0.21 ±  4%  perf-profile.children.cycles-pp.blk_cgroup_congested
      0.36            +0.0        0.39 ±  2%  perf-profile.children.cycles-pp.mas_walk
      0.14 ±  5%      +0.0        0.18 ±  4%  perf-profile.children.cycles-pp.handle_pte_fault
      0.22 ±  2%      +0.0        0.25 ±  4%  perf-profile.children.cycles-pp.__folio_throttle_swaprate
      1.06            +0.0        1.11        perf-profile.children.cycles-pp.shmem_get_folio_gfp
      0.37 ±  2%      +0.0        0.41        perf-profile.children.cycles-pp.__mod_node_page_state
      0.46            +0.0        0.51 ±  2%  perf-profile.children.cycles-pp.__mod_lruvec_state
      0.67            +0.1        0.74 ±  2%  perf-profile.children.cycles-pp.lock_vma_under_rcu
      0.13 ±  2%      +0.1        0.20 ±  2%  perf-profile.children.cycles-pp.free_pages_and_swap_cache
      0.12 ±  4%      +0.1        0.20 ±  3%  perf-profile.children.cycles-pp.free_swap_cache
      0.48 ±  4%      +0.1        0.55 ±  3%  perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
      1.25            +0.1        1.34        perf-profile.children.cycles-pp.shmem_fault
      0.49            +0.1        0.58        perf-profile.children.cycles-pp.page_remove_rmap
      1.35            +0.1        1.44        perf-profile.children.cycles-pp.__do_fault
      0.73 ±  3%      +0.1        0.82        perf-profile.children.cycles-pp.___perf_sw_event
      0.93 ±  2%      +0.1        1.04        perf-profile.children.cycles-pp.__perf_sw_event
      1.10            +0.1        1.22        perf-profile.children.cycles-pp.sync_regs
      0.22 ±  5%      +0.1        0.35 ±  5%  perf-profile.children.cycles-pp.__list_add_valid_or_report
      0.75 ±  6%      +0.1        0.88 ±  2%  perf-profile.children.cycles-pp.folio_add_new_anon_rmap
      0.85 ±  4%      +0.2        1.00 ±  2%  perf-profile.children.cycles-pp.__mod_lruvec_page_state
      1.44            +0.2        1.61        perf-profile.children.cycles-pp.native_irq_return_iret
      0.84            +0.2        1.02 ±  2%  perf-profile.children.cycles-pp.lru_add_fn
      2.74            +0.3        3.04        perf-profile.children.cycles-pp.error_entry
      2.56 ±  2%      +0.3        2.87        perf-profile.children.cycles-pp.irqentry_exit_to_user_mode
      2.77            +0.3        3.10        perf-profile.children.cycles-pp.__irqentry_text_end
      0.79            +0.4        1.17        perf-profile.children.cycles-pp._compound_head
      0.81 ±  2%      +0.4        1.21        perf-profile.children.cycles-pp.__free_one_page
      0.00            +2.9        2.92 ±  4%  perf-profile.children.cycles-pp.__rmqueue_pcplist
     32.30            +7.6       39.94        perf-profile.children.cycles-pp.do_cow_fault
     32.34            +7.6       39.99        perf-profile.children.cycles-pp.do_fault
     32.93            +7.7       40.67        perf-profile.children.cycles-pp.__handle_mm_fault
     33.81            +7.8       41.61        perf-profile.children.cycles-pp.handle_mm_fault
     35.36            +8.0       43.32        perf-profile.children.cycles-pp.do_user_addr_fault
     35.49            +8.0       43.46        perf-profile.children.cycles-pp.exc_page_fault
     42.09            +8.8       50.86        perf-profile.children.cycles-pp.asm_exc_page_fault
     49.44            +9.6       59.04        perf-profile.children.cycles-pp.testcase
     11.03 ±  2%     +10.8       21.81 ±  2%  perf-profile.children.cycles-pp.folio_lruvec_lock_irqsave
      8.29 ±  2%     +12.8       21.12 ±  2%  perf-profile.children.cycles-pp.finish_fault
      2.78 ±  5%     +13.4       16.21 ±  3%  perf-profile.children.cycles-pp.folio_batch_move_lru
      2.90 ±  5%     +13.4       16.33 ±  3%  perf-profile.children.cycles-pp.folio_add_lru_vma
      3.86 ±  4%     +13.6       17.43 ±  3%  perf-profile.children.cycles-pp.set_pte_range
     13.36            -2.0       11.38        perf-profile.self.cycles-pp.copy_page
     25.49            -1.2       24.28 ±  2%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      4.24            -0.8        3.48        perf-profile.self.cycles-pp._raw_spin_lock
      1.92            -0.5        1.42 ±  3%  perf-profile.self.cycles-pp.__list_del_entry_valid_or_report
      0.22 ±  4%      -0.1        0.13 ±  3%  perf-profile.self.cycles-pp.rmqueue
      0.10 ±  7%      -0.0        0.08 ±  4%  perf-profile.self.cycles-pp.rmqueue_bulk
      0.11            +0.0        0.12 ±  3%  perf-profile.self.cycles-pp.uncharge_folio
      0.09            +0.0        0.10 ±  3%  perf-profile.self.cycles-pp.put_page
      0.09 ±  4%      +0.0        0.10 ±  4%  perf-profile.self.cycles-pp.exc_page_fault
      0.07 ±  5%      +0.0        0.08 ±  4%  perf-profile.self.cycles-pp.exit_to_user_mode_prepare
      0.13            +0.0        0.14 ±  3%  perf-profile.self.cycles-pp.asm_exc_page_fault
      0.12 ±  4%      +0.0        0.14 ±  3%  perf-profile.self.cycles-pp._raw_spin_trylock
      0.08 ±  5%      +0.0        0.10 ±  3%  perf-profile.self.cycles-pp.get_pfnblock_flags_mask
      0.11 ±  4%      +0.0        0.13 ±  6%  perf-profile.self.cycles-pp.down_read_trylock
      0.22 ±  2%      +0.0        0.24 ±  2%  perf-profile.self.cycles-pp.get_page_from_freelist
      0.14 ±  3%      +0.0        0.15 ±  3%  perf-profile.self.cycles-pp.folio_add_new_anon_rmap
      0.14 ±  2%      +0.0        0.16 ±  3%  perf-profile.self.cycles-pp.blk_cgroup_congested
      0.11 ±  6%      +0.0        0.13 ±  2%  perf-profile.self.cycles-pp.cgroup_rstat_updated
      0.06 ± 11%      +0.0        0.08 ±  5%  perf-profile.self.cycles-pp.handle_pte_fault
      0.12 ±  4%      +0.0        0.14 ±  4%  perf-profile.self.cycles-pp.mas_walk
      0.18 ±  2%      +0.0        0.20 ±  2%  perf-profile.self.cycles-pp.free_unref_page_list
      0.28 ±  3%      +0.0        0.30 ±  2%  perf-profile.self.cycles-pp.vma_alloc_folio
      0.27 ±  2%      +0.0        0.30        perf-profile.self.cycles-pp.page_remove_rmap
      0.20            +0.0        0.24 ±  3%  perf-profile.self.cycles-pp.shmem_fault
      0.24 ±  3%      +0.0        0.27        perf-profile.self.cycles-pp.shmem_get_folio_gfp
      0.34            +0.0        0.38 ±  2%  perf-profile.self.cycles-pp.__alloc_pages
      0.35 ±  3%      +0.0        0.40        perf-profile.self.cycles-pp.__mod_node_page_state
      0.44 ±  2%      +0.1        0.49        perf-profile.self.cycles-pp.__handle_mm_fault
      0.38            +0.1        0.44 ±  3%  perf-profile.self.cycles-pp.lru_add_fn
      0.40 ±  4%      +0.1        0.46 ±  4%  perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
      0.00            +0.1        0.07 ±  7%  perf-profile.self.cycles-pp.__rmqueue_pcplist
      0.12 ±  3%      +0.1        0.19 ±  3%  perf-profile.self.cycles-pp.free_swap_cache
      0.65 ±  2%      +0.1        0.72        perf-profile.self.cycles-pp.___perf_sw_event
      0.29 ± 12%      +0.1        0.37 ±  6%  perf-profile.self.cycles-pp.__mod_lruvec_page_state
      0.29            +0.1        0.39 ±  2%  perf-profile.self.cycles-pp.zap_pte_range
      1.10            +0.1        1.22        perf-profile.self.cycles-pp.sync_regs
      0.21 ±  4%      +0.1        0.33 ±  6%  perf-profile.self.cycles-pp.__list_add_valid_or_report
      1.44            +0.2        1.60        perf-profile.self.cycles-pp.native_irq_return_iret
      0.38 ±  7%      +0.2        0.55 ±  2%  perf-profile.self.cycles-pp.folio_batch_move_lru
      2.73            +0.3        3.03        perf-profile.self.cycles-pp.error_entry
      2.47            +0.3        2.78        perf-profile.self.cycles-pp.irqentry_exit_to_user_mode
      2.77            +0.3        3.10        perf-profile.self.cycles-pp.__irqentry_text_end
      0.78            +0.4        1.15        perf-profile.self.cycles-pp._compound_head
      3.17            +0.4        3.54        perf-profile.self.cycles-pp.testcase
      0.75 ±  2%      +0.4        1.15        perf-profile.self.cycles-pp.__free_one_page

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
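
For readers who want to sanity-check the %change column in the comparison above, here is a minimal Python sketch. It assumes, based only on the figures shown in this report, that entries ending in "%" are relative changes between the two commits and that entries without "%" (rate and perf-profile rows) are absolute differences in percentage points. The helper names are hypothetical and are not part of lkp-tests. Because the report prints rounded means, recomputed values may differ in the last digit for some rows.

```python
def relative_change(base: float, new: float) -> float:
    """Relative change in percent, for rows whose %change ends in '%'."""
    return (new - base) / base * 100.0


def absolute_change(base: float, new: float) -> float:
    """Plain difference, for rows whose %change has no '%' suffix."""
    return new - base


# will-it-scale.per_process_ops: 165231 -> 186114
print(f"{relative_change(165231, 186114):+.1f}%")

# perf-stat.i.cache-miss-rate%: 81.02 -> 77.94 (percentage points)
print(f"{absolute_change(81.02, 77.94):+.1f}")

# perf-profile.calltrace.cycles-pp.__munmap: 23.92 -> 13.80 (percentage points)
print(f"{absolute_change(23.92, 13.80):+.1f}")
```

Under those assumptions the three printed values line up with the +12.6%, -3.1 and -10.1 entries reported for these metrics.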