Hello,

kernel test robot noticed a 128.6% improvement of vm-scalability.throughput on:


commit: c2a967f6ab0ec896648c0497d3dc15d8f136b148 ("mm/hugetlb_vmemmap: don't synchronize_rcu() without HVO")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

testcase: vm-scalability
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
parameters:

    runtime: 300s
    size: 8T
    test: anon-w-seq-hugetlb
    cpufreq_governor: performance


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240908/202409082259.783d11c3-oliver.sang@xxxxxxxxx

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase:
  gcc-12/performance/x86_64-rhel-8.3/debian-12-x86_64-20240206.cgz/300s/8T/lkp-icl-2sp2/anon-w-seq-hugetlb/vm-scalability

commit:
  9eace7e8e6 ("shmem_quota: build the object file conditionally to the config option")
  c2a967f6ab ("mm/hugetlb_vmemmap: don't synchronize_rcu() without HVO")

9eace7e8e60c3ac8 c2a967f6ab0ec896648c0497d3d
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
     31940            -74.5%        8147 ±  2%  uptime.idle
 2.578e+10            -91.7%   2.135e+09 ±  9%  cpuidle..time
  16240610 ±  2%      -91.1%     1448729 ±  3%  cpuidle..usage
   1059015 ± 23%     +143.4%     2577289 ± 20%  numa-numastat.node0.local_node
     23613 ± 31%     +375.6%      112309 ± 24%  numa-numastat.node0.numa_foreign
   1844937 ±  3%      +98.8%     3667008 ±  2%  numa-numastat.node0.numa_hit
    560064 ± 11%      +33.0%      744848 ±  9%  numa-numastat.node1.local_node
     23613 ± 31%     +375.9%      112379 ± 24%  numa-numastat.node1.numa_miss
     65.77            -91.1%        5.85 ±  8%  vmstat.cpu.id
     16.20 ±  2%     +176.7%       44.82        vmstat.cpu.us
     45.10 ±  2%     +178.5%      125.61        vmstat.procs.r
     11694            -59.8%        4695 ±  3%  vmstat.system.cs
    107920 ±  3%      +41.4%      152607        vmstat.system.in
     65.42            -60.1         5.30 ± 10%  mpstat.cpu.all.idle%
      0.35 ±  2%       +0.4         0.78 ±  3%  mpstat.cpu.all.irq%
      0.07             +0.0         0.08 ±  4%  mpstat.cpu.all.soft%
     17.82 ±  2%      +30.9        48.71        mpstat.cpu.all.sys%
     16.34 ±  2%      +28.8        45.13        mpstat.cpu.all.usr%
    145.50 ± 56%      -90.3%       14.17 ± 51%  mpstat.max_utilization.seconds
     82327 ±  2%     +134.2%      192839 ±  2%  vm-scalability.median
     10.16 ±  7%       -7.9         2.31 ± 46%  vm-scalability.median_stddev%
     11.28 ±  7%       -7.6         3.71 ± 72%  vm-scalability.stddev%
  10919990 ±  2%     +128.6%    24965297 ±  2%  vm-scalability.throughput
     48989 ±  6%     +789.8%      435923 ±  3%  vm-scalability.time.involuntary_context_switches
   1514783 ±  2%     +133.7%     3540338 ±  2%  vm-scalability.time.minor_page_faults
      4345 ±  2%     +174.6%       11931        vm-scalability.time.percent_of_cpu_this_job_got
      6920 ±  3%     +171.3%       18777        vm-scalability.time.system_time
      6381 ±  3%     +174.6%       17520        vm-scalability.time.user_time
   1364249 ±  2%      -97.9%       28851 ±  3%  vm-scalability.time.voluntary_context_switches
 3.115e+09 ±  2%     +134.1%   7.294e+09 ±  2%  vm-scalability.workload
  12586523            -86.1%     1749922 ± 17%  numa-vmstat.node0.nr_free_pages
    210.72 ± 52%      -95.6%        9.23 ± 98%  numa-vmstat.node0.nr_inactive_file
     40699 ±  5%      +12.2%       45671 ±  4%  numa-vmstat.node0.nr_slab_unreclaimable
    210.72 ± 52%      -95.6%        9.23 ± 98%  numa-vmstat.node0.nr_zone_inactive_file
     23613 ± 31%     +375.6%      112309 ± 24%  numa-vmstat.node0.numa_foreign
   1844407 ±  3%      +98.8%     3666145 ±  2%  numa-vmstat.node0.numa_hit
   1058484 ± 23%     +143.4%     2576426 ± 20%  numa-vmstat.node0.numa_local
     32680 ± 13%     +104.0%       66656 ± 16%  numa-vmstat.node1.nr_active_anon
      7227 ± 35%     +104.6%       14787 ± 18%  numa-vmstat.node1.nr_mapped
     28822 ±  6%      -12.1%       25343 ±  7%  numa-vmstat.node1.nr_slab_unreclaimable
     32680 ± 13%     +104.0%       66656 ± 16%  numa-vmstat.node1.nr_zone_active_anon
    558726 ± 11%      +33.1%      743806 ±  9%  numa-vmstat.node1.numa_local
     23613 ± 31%     +375.9%      112379 ± 24%  numa-vmstat.node1.numa_miss
     10828 ±  2%     +171.3%       29376 ±  4%  numa-meminfo.node0.HugePages_Free
     38486            +53.6%       59107 ±  2%  numa-meminfo.node0.HugePages_Surp
     38486            +53.6%       59107 ±  2%  numa-meminfo.node0.HugePages_Total
    843.20 ± 52%      -95.6%       37.05 ± 97%  numa-meminfo.node0.Inactive(file)
  50130416            -84.4%     7832175 ± 41%  numa-meminfo.node0.MemFree
  81554535            +51.9%   1.239e+08 ±  2%  numa-meminfo.node0.MemUsed
    162789 ±  5%      +12.2%      182674 ±  4%  numa-meminfo.node0.SUnreclaim
    204499 ±  9%      +15.7%      236612 ±  8%  numa-meminfo.node0.Slab
    130875 ± 13%     +103.8%      266757 ± 16%  numa-meminfo.node1.Active
    130833 ± 13%     +103.9%      266732 ± 16%  numa-meminfo.node1.Active(anon)
    346.50 ± 29%     +344.5%        1540 ± 33%  numa-meminfo.node1.HugePages_Surp
    346.50 ± 29%     +344.5%        1540 ± 33%  numa-meminfo.node1.HugePages_Total
     28293 ± 34%     +106.5%       58430 ± 18%  numa-meminfo.node1.Mapped
   4467756 ± 19%      +54.5%     6901715        numa-meminfo.node1.MemUsed
    115289 ±  6%      -12.1%      101363 ±  7%  numa-meminfo.node1.SUnreclaim
    141692 ± 15%     +113.9%      303149 ±  3%  meminfo.Active
    141525 ± 15%     +114.1%      302985 ±  3%  meminfo.Active(anon)
      1060            -94.9%       54.41 ± 82%  meminfo.Buffers
  92057649            -24.9%    69146326        meminfo.CommitLimit
   1068336 ±  2%      +18.2%     1262643        meminfo.Committed_AS
     10864 ±  2%     +171.5%       29498        meminfo.HugePages_Free
     10865 ±  2%     +171.5%       29499        meminfo.HugePages_Rsvd
     38880            +57.5%       61254        meminfo.HugePages_Surp
     38880            +57.5%       61255        meminfo.HugePages_Total
  79627804            +57.5%   1.255e+08        meminfo.Hugetlb
      1197            -89.8%      122.49 ± 66%  meminfo.Inactive(file)
     38220 ±  3%     +104.6%       78211 ± 11%  meminfo.Mapped
 1.765e+08            -26.0%   1.307e+08        meminfo.MemAvailable
 1.776e+08            -25.8%   1.318e+08        meminfo.MemFree
  86118884            +53.2%    1.32e+08        meminfo.Memused
    275702 ±  8%      +68.7%      465190        meminfo.Shmem
 3.609e+08 ± 12%      +70.6%   6.157e+08 ± 27%  proc-vmstat.compact_daemon_free_scanned
 3.609e+08 ± 12%      +70.6%   6.157e+08 ± 27%  proc-vmstat.compact_free_scanned
   1354752 ±  2%     +134.1%     3171840 ±  2%  proc-vmstat.htlb_buddy_alloc_success
     35383 ± 15%     +114.2%       75782 ±  3%  proc-vmstat.nr_active_anon
   4412727            -25.2%     3301489 ±  2%  proc-vmstat.nr_dirty_background_threshold
   8836244            -25.2%     6611050 ±  2%  proc-vmstat.nr_dirty_threshold
    843423             +5.6%      890521        proc-vmstat.nr_file_pages
  44475421            -25.0%    33346678 ±  2%  proc-vmstat.nr_free_pages
    196067             +4.1%      204088        proc-vmstat.nr_inactive_anon
    299.62            -89.8%       30.58 ± 66%  proc-vmstat.nr_inactive_file
      9813 ±  3%     +103.3%       19952 ± 11%  proc-vmstat.nr_mapped
      2689             +5.5%        2837        proc-vmstat.nr_page_table_pages
     68907 ±  8%      +68.8%      116343        proc-vmstat.nr_shmem
     69518             +2.1%       71010        proc-vmstat.nr_slab_unreclaimable
     35383 ± 15%     +114.2%       75782 ±  3%  proc-vmstat.nr_zone_active_anon
    196068             +4.1%      204089        proc-vmstat.nr_zone_inactive_anon
    299.62            -89.8%       30.58 ± 66%  proc-vmstat.nr_zone_inactive_file
     23613 ± 31%     +375.6%      112309 ± 24%  proc-vmstat.numa_foreign
     29590 ± 34%      +88.6%       55811 ± 13%  proc-vmstat.numa_hint_faults
      8999 ± 32%      +84.8%       16627 ±  5%  proc-vmstat.numa_hint_faults_local
   2469706            +79.8%     4440194        proc-vmstat.numa_hit
   1621730 ± 12%     +105.0%     3324181 ± 17%  proc-vmstat.numa_local
     23613 ± 31%     +375.9%      112379 ± 24%  proc-vmstat.numa_miss
     51238 ± 10%     +112.4%      108806 ±  4%  proc-vmstat.pgactivate
     47212 ± 13%     +150.5%      118274 ±  2%  proc-vmstat.pgalloc_dma
   7800626 ±  5%     +139.0%    18639703 ±  2%  proc-vmstat.pgalloc_dma32
 6.871e+08 ±  2%     +133.9%   1.607e+09 ±  2%  proc-vmstat.pgalloc_normal
   2426642            +84.7%     4481472        proc-vmstat.pgfault
 6.935e+08 ±  2%     +134.4%   1.625e+09 ±  2%  proc-vmstat.pgfree
     52701            +20.3%       63424        proc-vmstat.pgreuse
      2330 ± 13%     +123.8%        5216 ±  5%  proc-vmstat.unevictable_pgs_culled
      1.16 ± 51%       -1.1         0.03 ±100%  perf-profile.children.cycles-pp.common_startup_64
      1.16 ± 51%       -1.1         0.03 ±100%  perf-profile.children.cycles-pp.cpu_startup_entry
      1.16 ± 51%       -1.1         0.03 ±100%  perf-profile.children.cycles-pp.do_idle
      1.14 ± 52%       -1.1         0.03 ±100%  perf-profile.children.cycles-pp.start_secondary
      1.14 ± 51%       -1.1         0.03 ±100%  perf-profile.children.cycles-pp.cpuidle_idle_call
      1.08 ± 51%       -1.1         0.03 ±100%  perf-profile.children.cycles-pp.cpuidle_enter
      1.08 ± 51%       -1.1         0.03 ±100%  perf-profile.children.cycles-pp.cpuidle_enter_state
      1.06 ± 51%       -1.0         0.02 ± 99%  perf-profile.children.cycles-pp.acpi_idle_enter
      1.06 ± 51%       -1.0         0.02 ± 99%  perf-profile.children.cycles-pp.acpi_safe_halt
      2.65 ± 16%       -0.9         1.78 ±  3%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      1.31 ± 12%       -0.3         0.97 ±  4%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      1.03 ±  8%       -0.2         0.86 ±  5%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      1.00 ±  8%       -0.2         0.84 ±  5%  perf-profile.children.cycles-pp.hrtimer_interrupt
      0.23 ± 24%       -0.1         0.11 ±  8%  perf-profile.children.cycles-pp.__irq_exit_rcu
      0.22 ± 25%       -0.1         0.10 ±  9%  perf-profile.children.cycles-pp.handle_softirqs
      0.79 ±  6%       -0.1         0.68 ±  4%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.74 ±  6%       -0.1         0.65 ±  5%  perf-profile.children.cycles-pp.tick_nohz_handler
      0.67 ±  6%       -0.1         0.58 ±  5%  perf-profile.children.cycles-pp.update_process_times
      0.07 ± 12%       -0.0         0.05 ±  8%  perf-profile.children.cycles-pp.rcu_sched_clock_irq
      0.08 ±  8%       +0.0         0.10        perf-profile.children.cycles-pp.task_mm_cid_work
      0.10 ± 13%       +0.0         0.13 ±  3%  perf-profile.children.cycles-pp.irqentry_exit_to_user_mode
      0.04 ± 80%       +0.1         0.12 ± 19%  perf-profile.children.cycles-pp.fast_imageblit
      0.04 ± 80%       +0.1         0.12 ± 16%  perf-profile.children.cycles-pp.drm_fbdev_shmem_defio_imageblit
      0.04 ± 80%       +0.1         0.12 ± 16%  perf-profile.children.cycles-pp.sys_imageblit
      0.07 ± 62%       +0.1         0.16 ± 19%  perf-profile.children.cycles-pp.con_scroll
      0.07 ± 62%       +0.1         0.16 ± 19%  perf-profile.children.cycles-pp.fbcon_scroll
      0.07 ± 62%       +0.1         0.16 ± 19%  perf-profile.children.cycles-pp.lf
      0.06 ± 79%       +0.1         0.16 ± 19%  perf-profile.children.cycles-pp.bit_putcs
      0.07 ± 62%       +0.1         0.16 ± 18%  perf-profile.children.cycles-pp.vt_console_print
      0.06 ± 79%       +0.1         0.16 ± 18%  perf-profile.children.cycles-pp.fbcon_putcs
      0.06 ± 79%       +0.1         0.16 ± 18%  perf-profile.children.cycles-pp.fbcon_redraw
      0.06             -0.0         0.02 ± 99%  perf-profile.self.cycles-pp.ktime_get_update_offsets_now
      0.04 ± 80%       +0.1         0.12 ± 19%  perf-profile.self.cycles-pp.fast_imageblit
      0.67 ±  6%       +0.1         0.75 ±  2%  perf-profile.self.cycles-pp.folio_zero_user
      4.83            +48.9%        7.19        perf-stat.i.MPKI
 1.053e+10 ±  2%     +143.4%   2.563e+10 ±  2%  perf-stat.i.branch-instructions
      0.82 ±  2%       -0.8         0.05 ± 10%  perf-stat.i.branch-miss-rate%
   7434133 ±  2%      +13.9%     8468122        perf-stat.i.branch-misses
     50.36            +41.4        91.77        perf-stat.i.cache-miss-rate%
  2.15e+08           +171.8%   5.843e+08 ±  2%  perf-stat.i.cache-misses
 2.529e+08 ±  2%     +150.8%   6.343e+08 ±  2%  perf-stat.i.cache-references
     12140            -61.4%        4689 ±  3%  perf-stat.i.context-switches
      3.18            +24.8%        3.97 ±  2%  perf-stat.i.cpi
    128470             +1.4%      130236        perf-stat.i.cpu-clock
 1.128e+11 ±  2%     +184.1%   3.204e+11        perf-stat.i.cpu-cycles
    731.50 ±  5%      -64.5%      259.50 ±  4%  perf-stat.i.cpu-migrations
    915.11 ±  5%      -37.3%      574.09 ±  7%  perf-stat.i.cycles-between-cache-misses
  3.34e+10 ±  2%     +142.0%   8.084e+10 ±  2%  perf-stat.i.instructions
      0.41 ±  2%      -37.4%        0.26 ±  2%  perf-stat.i.ipc
      0.31 ± 34%      +82.3%        0.56 ± 22%  perf-stat.i.major-faults
      7499            +95.9%       14692 ±  2%  perf-stat.i.minor-faults
      7500            +95.9%       14692 ±  2%  perf-stat.i.page-faults
    128470             +1.4%      130236        perf-stat.i.task-clock
      6.45            +12.0%        7.23        perf-stat.overall.MPKI
      0.07 ±  3%       -0.0         0.03 ±  2%  perf-stat.overall.branch-miss-rate%
     85.15             +6.9        92.02        perf-stat.overall.cache-miss-rate%
      3.39 ±  2%      +16.7%        3.95 ±  2%  perf-stat.overall.cpi
      0.30 ±  2%      -14.3%        0.25 ±  2%  perf-stat.overall.ipc
 1.077e+10 ±  2%     +134.0%    2.52e+10 ±  2%  perf-stat.ps.branch-instructions
 2.204e+08           +160.6%   5.745e+08 ±  2%  perf-stat.ps.cache-misses
 2.589e+08           +141.1%   6.242e+08 ±  2%  perf-stat.ps.cache-references
     11680            -60.4%        4621 ±  2%  perf-stat.ps.context-switches
 1.157e+11 ±  2%     +171.5%   3.141e+11        perf-stat.ps.cpu-cycles
    706.28 ±  5%      -64.2%      252.70 ±  4%  perf-stat.ps.cpu-migrations
 3.416e+10 ±  2%     +132.7%   7.948e+10 ±  2%  perf-stat.ps.instructions
      0.36 ± 29%      +57.7%        0.56 ± 22%  perf-stat.ps.major-faults
      7637            +88.3%       14379 ±  2%  perf-stat.ps.minor-faults
      7637            +88.3%       14380 ±  2%  perf-stat.ps.page-faults
  1.05e+13 ±  2%     +131.7%   2.432e+13        perf-stat.total.instructions
   5752707 ±  8%     +241.6%    19652361        sched_debug.cfs_rq:/.avg_vruntime.avg
   6235901 ±  8%     +224.1%    20207800        sched_debug.cfs_rq:/.avg_vruntime.max
   4671237 ±  9%     +273.8%    17460516 ±  2%  sched_debug.cfs_rq:/.avg_vruntime.min
    296257 ± 16%      +50.4%      445559 ± 19%  sched_debug.cfs_rq:/.avg_vruntime.stddev
      0.47 ± 37%      +84.8%        0.87        sched_debug.cfs_rq:/.h_nr_running.avg
      1.42 ± 16%      +43.1%        2.03 ± 12%  sched_debug.cfs_rq:/.h_nr_running.max
      0.17 ± 57%     +266.7%        0.61 ± 25%  sched_debug.cfs_rq:/.h_nr_running.min
      7787 ±110%    +1165.4%       98542 ± 28%  sched_debug.cfs_rq:/.left_deadline.avg
    996783 ±110%     +914.7%    10114565 ± 18%  sched_debug.cfs_rq:/.left_deadline.max
     87759 ±110%    +1007.3%      971722 ± 20%  sched_debug.cfs_rq:/.left_deadline.stddev
      7787 ±110%    +1165.4%       98541 ± 28%  sched_debug.cfs_rq:/.left_vruntime.avg
    996755 ±110%     +914.7%    10114478 ± 18%  sched_debug.cfs_rq:/.left_vruntime.max
     87756 ±110%    +1007.3%      971713 ± 20%  sched_debug.cfs_rq:/.left_vruntime.stddev
      9220 ± 16%      +58.4%       14608 ±  8%  sched_debug.cfs_rq:/.load.avg
      1335 ± 57%     +261.0%        4820 ± 25%  sched_debug.cfs_rq:/.load.min
   5752707 ±  8%     +241.6%    19652362        sched_debug.cfs_rq:/.min_vruntime.avg
   6235901 ±  8%     +224.1%    20207800        sched_debug.cfs_rq:/.min_vruntime.max
   4671237 ±  9%     +273.8%    17460531 ±  2%  sched_debug.cfs_rq:/.min_vruntime.min
    296257 ± 16%      +50.4%      445558 ± 19%  sched_debug.cfs_rq:/.min_vruntime.stddev
      0.47 ± 37%      +77.1%        0.83 ±  4%  sched_debug.cfs_rq:/.nr_running.avg
      0.17 ± 57%     +266.7%        0.61 ± 25%  sched_debug.cfs_rq:/.nr_running.min
      7787 ±110%    +1165.4%       98541 ± 28%  sched_debug.cfs_rq:/.right_vruntime.avg
    996755 ±110%     +914.7%    10114489 ± 18%  sched_debug.cfs_rq:/.right_vruntime.max
     87756 ±110%    +1007.3%      971714 ± 20%  sched_debug.cfs_rq:/.right_vruntime.stddev
    496.10 ± 36%      +83.3%      909.20        sched_debug.cfs_rq:/.runnable_avg.avg
      1271 ± 11%      +58.2%        2010 ± 14%  sched_debug.cfs_rq:/.runnable_avg.max
    492.02 ± 36%      +75.7%      864.59 ±  3%  sched_debug.cfs_rq:/.util_avg.avg
      1125 ± 11%      +35.7%        1527 ± 15%  sched_debug.cfs_rq:/.util_avg.max
    100.63 ± 47%     +449.9%      553.35 ±  8%  sched_debug.cfs_rq:/.util_est.avg
    814.39 ± 11%     +102.6%        1649 ± 10%  sched_debug.cfs_rq:/.util_est.max
      0.61 ±100%    +2045.5%       13.11 ± 41%  sched_debug.cfs_rq:/.util_est.min
    202.99 ± 33%      +76.9%      359.01 ±  9%  sched_debug.cfs_rq:/.util_est.stddev
   1304244 ±  8%      +17.7%     1535515 ±  9%  sched_debug.cpu.avg_idle.max
     99574 ± 12%      +30.7%      130140 ± 10%  sched_debug.cpu.avg_idle.stddev
    945.30 ±  3%      +13.1%        1069        sched_debug.cpu.clock_task.stddev
      4198 ± 48%     +133.6%        9809 ±  5%  sched_debug.cpu.curr->pid.avg
      8874 ±  4%      +22.9%       10903        sched_debug.cpu.curr->pid.max
    673918 ±  6%      +17.5%      792166 ±  9%  sched_debug.cpu.max_idle_balance_cost.max
     21690 ± 31%     +111.3%       45830 ± 21%  sched_debug.cpu.max_idle_balance_cost.stddev
      0.47 ± 38%      +85.7%        0.87        sched_debug.cpu.nr_running.avg
      1.42 ± 16%      +47.1%        2.08 ± 11%  sched_debug.cpu.nr_running.max
     14507 ±  7%      -54.9%        6544 ±  3%  sched_debug.cpu.nr_switches.avg
      6659 ± 22%      -62.9%        2473 ±  4%  sched_debug.cpu.nr_switches.min
      0.35 ± 48%      -98.8%        0.00 ± 84%  sched_debug.cpu.nr_uninterruptible.avg
     57.48 ± 40%      -69.4%       17.58 ± 17%  sched_debug.cpu.nr_uninterruptible.max
    -37.57            -53.0%      -17.64        sched_debug.cpu.nr_uninterruptible.min
     13.26 ± 14%      -59.9%        5.32 ±  4%  sched_debug.cpu.nr_uninterruptible.stddev
      0.00 ± 51%   +1.5e+05%        0.92 ± 28%  sched_debug.rt_rq:.rt_time.avg
      0.08 ± 51%   +1.5e+05%      117.15 ± 28%  sched_debug.rt_rq:.rt_time.max
      0.01 ± 51%   +1.5e+05%       10.31 ± 28%  sched_debug.rt_rq:.rt_time.stddev
      0.03 ±215%    +2879.8%        0.96 ± 49%  perf-sched.sch_delay.avg.ms.__cond_resched.__alloc_pages_noprof.alloc_pages_mpol_noprof.folio_alloc_mpol_noprof.vma_alloc_folio_noprof
      0.13 ±103%    +1189.9%        1.68 ±119%  perf-sched.sch_delay.avg.ms.__cond_resched.__tlb_batch_free_encoded_pages.tlb_finish_mmu.exit_mmap.mmput
      0.85 ± 86%     +231.7%        2.80 ± 54%  perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      0.01 ±223%    +7452.6%        0.98 ± 52%  perf-sched.sch_delay.avg.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
      0.26 ± 87%     +749.2%        2.17 ± 20%  perf-sched.sch_delay.avg.ms.__cond_resched.folio_zero_user.hugetlb_no_page.hugetlb_fault.handle_mm_fault
      0.01 ±223%   +17506.7%        2.20 ± 46%  perf-sched.sch_delay.avg.ms.__cond_resched.hugetlb_no_page.hugetlb_fault.handle_mm_fault.do_user_addr_fault
      0.03 ±147%   +11098.0%        3.71 ± 36%  perf-sched.sch_delay.avg.ms.__cond_resched.mutex_lock.hugetlb_fault.handle_mm_fault.do_user_addr_fault
      0.10 ±152%     +369.6%        0.45 ± 35%  perf-sched.sch_delay.avg.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
      0.00 ±223%   +29071.4%        0.34 ± 40%  perf-sched.sch_delay.avg.ms.__cond_resched.task_work_run.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.04 ±212%    +3041.4%        1.28 ± 77%  perf-sched.sch_delay.avg.ms.__cond_resched.zap_pmd_range.isra.0.unmap_page_range
      0.02 ± 75%   +10356.6%        2.13 ± 25%  perf-sched.sch_delay.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.14 ± 80%     +760.6%        1.22 ± 53%  perf-sched.sch_delay.avg.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
      0.12 ±186%    +1544.3%        1.93 ± 53%  perf-sched.sch_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
      0.10 ± 76%     +418.1%        0.50 ± 71%  perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
      0.02 ± 53%     +246.5%        0.08 ± 18%  perf-sched.sch_delay.avg.ms.irq_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.24 ±112%     +242.9%        0.82 ± 42%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_common_interrupt.[unknown]
      0.13 ±105%    +8124.3%       10.43 ±111%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown]
      0.01 ±205%   +17479.3%        1.70 ± 12%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown].[unknown]
      0.48 ±106%     +537.5%        3.07 ± 11%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.32 ±107%     +241.4%        1.08 ± 38%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      0.08 ±107%     +626.3%        0.58 ± 20%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown]
      0.12 ± 57%     +236.4%        0.40 ± 32%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
      0.09 ±109%     +412.2%        0.46 ± 51%  perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      0.12 ±105%    +1550.9%        1.90 ± 42%  perf-sched.sch_delay.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.hugetlb_fault
      0.01 ± 11%     -100.0%        0.00        perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.__wait_rcu_gp
      0.04 ± 69%    +1461.9%        0.58 ± 36%  perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      0.02 ± 50%    +1633.6%        0.37 ± 62%  perf-sched.sch_delay.avg.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
      0.01 ± 43%   +68186.5%        4.21 ±132%  perf-sched.sch_delay.avg.ms.schedule_timeout.khugepaged_wait_work.khugepaged.kthread
      0.02 ± 97%    +3038.3%        0.49 ± 41%  perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
      0.03 ±215%    +9695.3%        3.15 ± 42%  perf-sched.sch_delay.max.ms.__cond_resched.__alloc_pages_noprof.alloc_pages_mpol_noprof.folio_alloc_mpol_noprof.vma_alloc_folio_noprof
      0.02 ±223%   +18158.3%        3.29 ± 43%  perf-sched.sch_delay.max.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
     27.49 ±100%    +1156.8%      345.46 ± 69%  perf-sched.sch_delay.max.ms.__cond_resched.folio_zero_user.hugetlb_no_page.hugetlb_fault.handle_mm_fault
      0.03 ±223%   +23123.8%        6.66 ± 66%  perf-sched.sch_delay.max.ms.__cond_resched.hugetlb_no_page.hugetlb_fault.handle_mm_fault.do_user_addr_fault
      0.05 ±169%   +20804.3%       11.25 ± 70%  perf-sched.sch_delay.max.ms.__cond_resched.mutex_lock.hugetlb_fault.handle_mm_fault.do_user_addr_fault
      0.17 ±108%    +2409.6%        4.37 ± 58%  perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
      0.00 ±223%   +55471.4%        0.65 ± 88%  perf-sched.sch_delay.max.ms.__cond_resched.task_work_run.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.04 ±210%    +6953.3%        2.89 ± 67%  perf-sched.sch_delay.max.ms.__cond_resched.zap_pmd_range.isra.0.unmap_page_range
      0.07 ± 86%    +8775.7%        6.33 ± 12%  perf-sched.sch_delay.max.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.68 ±193%     +754.2%        5.82 ± 28%  perf-sched.sch_delay.max.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
      0.09 ± 62%     +102.7%        0.19 ± 14%  perf-sched.sch_delay.max.ms.irq_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.17 ±127%   +85837.7%      150.10 ±137%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown]
      0.01 ±209%   +35140.8%        4.46 ± 15%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown].[unknown]
     39.02 ±126%     +657.8%      295.69 ± 67%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      1.04 ±126%     +520.1%        6.42 ± 39%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      0.30 ±126%    +4472.7%       13.93 ± 97%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_call_function_single.[unknown]
      1.13 ±116%    +1283.8%       15.68 ± 79%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
      0.08 ± 44%     +703.6%        0.62 ±170%  perf-sched.sch_delay.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
     22.08 ±162%     +310.8%       90.72 ± 70%  perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      0.95 ±144%   +21640.8%      206.94 ±103%  perf-sched.sch_delay.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.hugetlb_fault
      0.54 ±  7%     -100.0%        0.00        perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.__wait_rcu_gp
      1.51 ± 85%     +414.6%        7.77 ± 77%  perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      0.10 ± 52%    +2736.0%        2.70 ± 49%  perf-sched.sch_delay.max.ms.schedule_timeout.kcompactd.kthread.ret_from_fork
      0.01 ± 43%   +68186.5%        4.21 ±132%  perf-sched.sch_delay.max.ms.schedule_timeout.khugepaged_wait_work.khugepaged.kthread
      0.20 ±155%    +1877.9%        3.99 ± 20%  perf-sched.sch_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
      0.09 ± 64%    +1395.6%        1.31 ± 21%  perf-sched.total_sch_delay.average.ms
      0.32 ±123%    +1292.2%        4.50 ± 19%  perf-sched.wait_and_delay.avg.ms.__cond_resched.folio_zero_user.hugetlb_no_page.hugetlb_fault.handle_mm_fault
    639.13 ±  9%      -69.6%      194.08 ± 92%  perf-sched.wait_and_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
      0.26 ±148%    +2285.7%        6.17 ± 12%  perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.12 ±148%     +612.4%        0.85 ± 33%  perf-sched.wait_and_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
     22.35 ± 33%     -100.0%        0.00        perf-sched.wait_and_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.__wait_rcu_gp
      4.07 ± 18%      +48.6%        6.05 ± 13%  perf-sched.wait_and_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
    167.00 ±106%    +1946.1%        3417 ±  8%  perf-sched.wait_and_delay.count.__cond_resched.folio_zero_user.hugetlb_no_page.hugetlb_fault.handle_mm_fault
      3.17 ±103%     +268.4%       11.67 ± 45%  perf-sched.wait_and_delay.count.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    334.67 ± 46%      +84.4%      617.17 ± 19%  perf-sched.wait_and_delay.count.devkmsg_read.vfs_read.ksys_read.do_syscall_64
     97.67 ±142%    +3017.4%        3044 ± 21%  perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
     39.17 ±144%    +2545.5%        1036 ± 34%  perf-sched.wait_and_delay.count.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
    344.33 ± 46%      +89.1%      651.17 ± 17%  perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
     22121 ± 48%     -100.0%        0.00        perf-sched.wait_and_delay.count.schedule_timeout.__wait_for_common.wait_for_completion_state.__wait_rcu_gp
      1246 ± 19%      -29.3%      881.00 ± 15%  perf-sched.wait_and_delay.count.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
     74.50 ±142%    +3556.4%        2724 ± 65%  perf-sched.wait_and_delay.count.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
    779.50 ±  3%      +28.5%        1001 ±  4%  perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
     23.15 ±127%    +2309.5%      557.71 ± 65%  perf-sched.wait_and_delay.max.ms.__cond_resched.folio_zero_user.hugetlb_no_page.hugetlb_fault.handle_mm_fault
      7.10 ±152%    +7819.7%      562.68 ± 73%  perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      1.69 ±172%    +1807.7%       32.30 ± 74%  perf-sched.wait_and_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
    329.10 ±  2%     -100.0%        0.00        perf-sched.wait_and_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.__wait_rcu_gp
     86.41 ± 72%     +200.7%      259.84 ± 32%  perf-sched.wait_and_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      3082 ± 15%      -40.8%        1823 ± 19%  perf-sched.wait_and_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.01 ±223%    +7452.6%        0.98 ± 52%  perf-sched.wait_time.avg.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
      0.17 ± 85%    +1266.7%        2.33 ± 18%  perf-sched.wait_time.avg.ms.__cond_resched.folio_zero_user.hugetlb_no_page.hugetlb_fault.handle_mm_fault
      0.01 ±223%   +18110.7%        2.28 ± 50%  perf-sched.wait_time.avg.ms.__cond_resched.hugetlb_no_page.hugetlb_fault.handle_mm_fault.do_user_addr_fault
      0.03 ±147%   +11280.9%        3.77 ± 34%  perf-sched.wait_time.avg.ms.__cond_resched.mutex_lock.hugetlb_fault.handle_mm_fault.do_user_addr_fault
      0.17 ±115%     +430.6%        0.92 ± 49%  perf-sched.wait_time.avg.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
      0.05 ±223%     +914.1%        0.49 ± 91%  perf-sched.wait_time.avg.ms.__cond_resched.task_work_run.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.04 ±220%    +3132.1%        1.28 ± 77%  perf-sched.wait_time.avg.ms.__cond_resched.zap_pmd_range.isra.0.unmap_page_range
    639.01 ±  9%      -69.7%      193.44 ± 92%  perf-sched.wait_time.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
      6.93 ± 59%     +214.4%       21.79 ± 53%  perf-sched.wait_time.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call
      0.24 ±112%     +242.9%        0.82 ± 42%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_common_interrupt.[unknown]
      0.13 ±106%    +4417.1%        5.68 ±135%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown]
      0.38 ±105%     +711.6%        3.09 ± 12%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.14 ± 50%     +215.8%        0.45 ± 34%  perf-sched.wait_time.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
      0.19 ± 80%     +997.8%        2.08 ± 41%  perf-sched.wait_time.avg.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.hugetlb_fault
     22.34 ± 33%     -100.0%        0.00        perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.__wait_rcu_gp
      1.22 ± 44%     +524.5%        7.61 ± 21%  perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      4.02 ± 18%      +46.1%        5.88 ± 11%  perf-sched.wait_time.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      0.01 ±158%    +3518.8%        0.29 ± 60%  perf-sched.wait_time.avg.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
      3783 ± 31%      -57.4%        1612 ± 78%  perf-sched.wait_time.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      0.02 ±223%   +18158.3%        3.29 ± 43%  perf-sched.wait_time.max.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
      8.25 ±116%    +3088.9%      263.05 ± 68%  perf-sched.wait_time.max.ms.__cond_resched.folio_zero_user.hugetlb_no_page.hugetlb_fault.handle_mm_fault
      0.03 ±223%   +23123.8%        6.66 ± 66%  perf-sched.wait_time.max.ms.__cond_resched.hugetlb_no_page.hugetlb_fault.handle_mm_fault.do_user_addr_fault
      0.05 ±169%   +20804.3%       11.25 ± 70%  perf-sched.wait_time.max.ms.__cond_resched.mutex_lock.hugetlb_fault.handle_mm_fault.do_user_addr_fault
      0.55 ±149%    +1198.6%        7.14 ± 56%  perf-sched.wait_time.max.ms.__cond_resched.stop_one_cpu.migrate_task_to.task_numa_migrate.isra
      0.05 ±223%    +2800.0%        1.41 ±161%  perf-sched.wait_time.max.ms.__cond_resched.task_work_run.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      0.04 ±213%    +7040.3%        2.89 ± 67%  perf-sched.wait_time.max.ms.__cond_resched.zap_pmd_range.isra.0.unmap_page_range
      0.17 ±128%   +37273.6%       64.84 ±198%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_exc_page_fault.[unknown]
     29.13 ±164%     +865.7%      281.34 ± 73%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      1.72 ± 91%    +1062.5%       19.99 ± 54%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
      3.12 ± 55%      +73.5%        5.41 ± 15%  perf-sched.wait_time.max.ms.rcu_gp_kthread.kthread.ret_from_fork.ret_from_fork_asm
      2.11 ± 90%    +9785.3%      208.45 ±102%  perf-sched.wait_time.max.ms.schedule_preempt_disabled.__mutex_lock.constprop.0.hugetlb_fault
    329.10 ±  2%     -100.0%        0.00        perf-sched.wait_time.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.__wait_rcu_gp
     10.87 ± 54%    +1057.8%      125.85 ± 46%  perf-sched.wait_time.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
     79.98 ± 83%     +191.2%      232.93 ± 19%  perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
      0.20 ±162%    +1849.9%        3.80 ± 26%  perf-sched.wait_time.max.ms.wait_for_partner.fifo_open.do_dentry_open.vfs_open
      3082 ± 15%      -40.9%        1821 ± 19%  perf-sched.wait_time.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki