* Yang Shi <shy828301@xxxxxxxxx> [221220 13:04]: > On Mon, Dec 19, 2022 at 3:30 AM kernel test robot <yujie.liu@xxxxxxxxx> wrote: > > > > Greetings, > > > > Please note that we reported a regression in will-it-scale malloc1 > > benchmark on below commit > > f35b5d7d676e ("mm: align larger anonymous mappings on THP boundaries") > > at > > https://lore.kernel.org/all/202210181535.7144dd15-yujie.liu@xxxxxxxxx/ > > and Nathan reported a kbuild slowdown under clang toolchain at > > https://lore.kernel.org/all/Y1DNQaoPWxE+rGce@dev-arch.thelio-3990X/ > > That commit was finally reverted. > > > > When we tested the revert commit, the score in malloc1 benchmark > > recovered, but we observed another regression in mmap1 benchmark. > > > > "Yin, Fengwei" helped to check and got below clues: > > > > 1. The regression is related with the VMA merge with prev/next > > VMA when doing mmap. > > > > 2. Before the patch reverted, almost all the VMA for 128M mapping > > can't be merged with prev/next VMA. So always create new VMA. > > With the patch reverted, most VMA for 128 mapping can be merged. > > > > It looks like VMA merging introduce more latency comparing to > > creating new VMA. > > > > 3. If force to create new VMA with patch reverted, the result of > > mmap1_thread is restored. > > > > 4. The thp_get_unmapped_area() adds a padding to request mapping > > length. The padding is 2M in general. I believe this padding > > break VMA merging behavior. > > > > 5. No idea about why the difference of the two path (VMA merging > > vs New VMA) is not shown in perf data > > IIRC thp_get_unmapped_area() has been behaving like that for years. > The other change between the problematic commit and the revert commit, > which might have an impact to VMA merging, is maple tree. Did you try to > bisect further? > There was also the work done to vma_merge(). Vlastimil (added to Cc) tracked down an issue with mremap() quite recently [1], which sounds a lot like what is happening here - especially with the padding. > > BTW, is this similar to > https://lore.kernel.org/linux-mm/20221219180857.u6opzhqqbbfxdj3h@revolver/T/#t > ? Yes, it looks to be similar. I'm surprised the mmap1 benchmark was altered with this commit, or am I reading this email incorrectly? The trace below does not seem to show what RedHad [2] found in its testing. [1]. https://lore.kernel.org/all/20221216163227.24648-1-vbabka@xxxxxxx/T/#u [2]. https://bugzilla.redhat.com/show_bug.cgi?id=2149636 > > > > > Please check below report for details. > > > > > > FYI, we noticed a -21.1% regression of will-it-scale.per_thread_ops due to commit: > > > > commit: 0ba09b1733878afe838fe35c310715fda3d46428 ("Revert "mm: align larger anonymous mappings on THP boundaries"") > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master > > > > in testcase: will-it-scale > > on test machine: 104 threads 2 sockets (Skylake) with 192G memory > > with following parameters: > > > > nr_task: 50% > > mode: thread > > test: mmap1 > > cpufreq_governor: performance > > > > test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two. > > test-url: https://github.com/antonblanchard/will-it-scale > > > > In addition to that, the commit also has significant impact on the following tests: > > > > +------------------+------------------------------------------------------------------------------------------------+ > > | testcase: change | will-it-scale: will-it-scale.per_process_ops 1943.6% improvement | > > | test machine | 128 threads 4 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory | > > | test parameters | cpufreq_governor=performance | > > | | mode=process | > > | | nr_task=50% | > > | | test=malloc1 | > > +------------------+------------------------------------------------------------------------------------------------+ > > | testcase: change | unixbench: unixbench.score 2.6% improvement | > > | test machine | 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake) with 32G memory | > > | test parameters | cpufreq_governor=performance | > > | | nr_task=30% | > > | | runtime=300s | > > | | test=shell8 | > > +------------------+------------------------------------------------------------------------------------------------+ > > | testcase: change | phoronix-test-suite: phoronix-test-suite.build-eigen.0.seconds 9.1% regression | > > | test machine | 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz (Cascade Lake) with 512G memory | > > | test parameters | cpufreq_governor=performance | > > | | test=build-eigen-1.1.0 | > > +------------------+------------------------------------------------------------------------------------------------+ > > | testcase: change | will-it-scale: will-it-scale.per_process_ops 2882.9% improvement | > > | test machine | 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz (Cascade Lake) with 128G memory | > > | test parameters | cpufreq_governor=performance | > > | | mode=process | > > | | nr_task=100% | > > | | test=malloc1 | > > +------------------+------------------------------------------------------------------------------------------------+ > > | testcase: change | will-it-scale: will-it-scale.per_process_ops 12.7% improvement | > > | test machine | 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz (Cascade Lake) with 128G memory | > > | test parameters | cpufreq_governor=performance | > > | | mode=process | > > | | nr_task=50% | > > | | test=mmap1 | > > +------------------+------------------------------------------------------------------------------------------------+ > > | testcase: change | stress-ng: stress-ng.pthread.ops_per_sec 600.6% improvement | > > | test machine | 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory | > > | test parameters | class=scheduler | > > | | cpufreq_governor=performance | > > | | nr_threads=100% | > > | | sc_pid_max=4194304 | > > | | test=pthread | > > | | testtime=60s | > > +------------------+------------------------------------------------------------------------------------------------+ > > | testcase: change | will-it-scale: will-it-scale.per_process_ops 601.0% improvement | > > | test machine | 104 threads 2 sockets (Skylake) with 192G memory | > > | test parameters | cpufreq_governor=performance | > > | | mode=process | > > | | nr_task=50% | > > | | test=malloc1 | > > +------------------+------------------------------------------------------------------------------------------------+ > > > > > > Details are as below: > > > > ========================================================================================= > > compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase: > > gcc-11/performance/x86_64-rhel-8.3/thread/50%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/mmap1/will-it-scale > > > > commit: > > 23393c6461 ("char: tpm: Protect tpm_pm_suspend with locks") > > 0ba09b1733 ("Revert "mm: align larger anonymous mappings on THP boundaries"") > > > > 23393c6461422df5 0ba09b1733878afe838fe35c310 > > ---------------- --------------------------- > > %stddev %change %stddev > > \ | \ > > 140227 -21.1% 110582 ą 3% will-it-scale.52.threads > > 49.74 +0.1% 49.78 will-it-scale.52.threads_idle > > 2696 -21.1% 2126 ą 3% will-it-scale.per_thread_ops > > 301.30 -0.0% 301.26 will-it-scale.time.elapsed_time > > 301.30 -0.0% 301.26 will-it-scale.time.elapsed_time.max > > 3.67 ą 71% -22.7% 2.83 ą 47% will-it-scale.time.involuntary_context_switches > > 0.67 ą165% -75.0% 0.17 ą223% will-it-scale.time.major_page_faults > > 9772 -0.7% 9702 will-it-scale.time.maximum_resident_set_size > > 7274 -0.3% 7254 will-it-scale.time.minor_page_faults > > 4096 +0.0% 4096 will-it-scale.time.page_size > > 0.04 ą 16% -4.0% 0.04 will-it-scale.time.system_time > > 0.06 ą 24% -11.8% 0.05 ą 16% will-it-scale.time.user_time > > 102.83 +1.9% 104.83 ą 2% will-it-scale.time.voluntary_context_switches > > 140227 -21.1% 110582 ą 3% will-it-scale.workload > > 1.582e+10 +0.1% 1.584e+10 cpuidle..time > > 33034032 -0.0% 33021393 cpuidle..usage > > 10.00 +0.0% 10.00 dmesg.bootstage:last > > 172.34 +0.1% 172.58 dmesg.timestamp:last > > 10.00 +0.0% 10.00 kmsg.bootstage:last > > 172.34 +0.1% 172.58 kmsg.timestamp:last > > 362.22 +0.0% 362.25 uptime.boot > > 21363 +0.1% 21389 uptime.idle > > 55.94 +0.2% 56.06 boot-time.boot > > 38.10 +0.2% 38.19 boot-time.dhcp > > 5283 +0.2% 5295 boot-time.idle > > 1.11 -0.1% 1.11 boot-time.smp_boot > > 50.14 +0.0 50.16 mpstat.cpu.all.idle% > > 0.03 ą223% -0.0 0.00 ą223% mpstat.cpu.all.iowait% > > 1.02 +0.0 1.03 mpstat.cpu.all.irq% > > 0.03 ą 4% -0.0 0.02 mpstat.cpu.all.soft% > > 48.59 +0.0 48.61 mpstat.cpu.all.sys% > > 0.20 ą 2% -0.0 0.17 ą 4% mpstat.cpu.all.usr% > > 0.00 -100.0% 0.00 numa-numastat.node0.interleave_hit > > 328352 ą 15% -7.2% 304842 ą 20% numa-numastat.node0.local_node > > 374230 ą 6% -4.2% 358578 ą 7% numa-numastat.node0.numa_hit > > 45881 ą 75% +17.1% 53735 ą 69% numa-numastat.node0.other_node > > 0.00 -100.0% 0.00 numa-numastat.node1.interleave_hit > > 381812 ą 13% +5.9% 404461 ą 14% numa-numastat.node1.local_node > > 430007 ą 5% +3.4% 444810 ą 5% numa-numastat.node1.numa_hit > > 48195 ą 71% -16.3% 40348 ą 92% numa-numastat.node1.other_node > > 301.30 -0.0% 301.26 time.elapsed_time > > 301.30 -0.0% 301.26 time.elapsed_time.max > > 3.67 ą 71% -22.7% 2.83 ą 47% time.involuntary_context_switches > > 0.67 ą165% -75.0% 0.17 ą223% time.major_page_faults > > 9772 -0.7% 9702 time.maximum_resident_set_size > > 7274 -0.3% 7254 time.minor_page_faults > > 4096 +0.0% 4096 time.page_size > > 0.04 ą 16% -4.0% 0.04 time.system_time > > 0.06 ą 24% -11.8% 0.05 ą 16% time.user_time > > 102.83 +1.9% 104.83 ą 2% time.voluntary_context_switches > > 50.00 +0.0% 50.00 vmstat.cpu.id > > 49.00 +0.0% 49.00 vmstat.cpu.sy > > 0.00 -100.0% 0.00 vmstat.cpu.us > > 0.00 -100.0% 0.00 vmstat.cpu.wa > > 12.50 ą100% -66.7% 4.17 ą223% vmstat.io.bi > > 3.33 ą141% -55.0% 1.50 ą223% vmstat.io.bo > > 6.00 ą 47% -16.7% 5.00 ą 44% vmstat.memory.buff > > 4150651 -0.1% 4148516 vmstat.memory.cache > > 1.912e+08 +0.1% 1.913e+08 vmstat.memory.free > > 0.00 -100.0% 0.00 vmstat.procs.b > > 50.50 -0.3% 50.33 vmstat.procs.r > > 8274 ą 2% +1.2% 8371 ą 4% vmstat.system.cs > > 211078 -0.1% 210826 vmstat.system.in > > 1399 +0.0% 1399 turbostat.Avg_MHz > > 50.12 +0.0 50.13 turbostat.Busy% > > 2799 -0.0% 2798 turbostat.Bzy_MHz > > 208677 ą 13% +1112.3% 2529776 ą194% turbostat.C1 > > 0.03 ą 89% +0.3 0.36 ą203% turbostat.C1% > > 27078371 ą 15% -22.0% 21125809 ą 51% turbostat.C1E > > 37.41 ą 33% -9.4 28.04 ą 62% turbostat.C1E% > > 5088326 ą 84% +63.1% 8298766 ą 77% turbostat.C6 > > 12.59 ą 99% +9.1 21.69 ą 78% turbostat.C6% > > 49.79 -0.1% 49.75 turbostat.CPU%c1 > > 0.08 ą 71% +37.3% 0.12 ą 78% turbostat.CPU%c6 > > 43.67 -0.4% 43.50 turbostat.CoreTmp > > 0.03 +0.0% 0.03 turbostat.IPC > > 64483530 -0.2% 64338768 turbostat.IRQ > > 647657 ą 2% +63.2% 1057048 ą 98% turbostat.POLL > > 0.01 +0.0 0.05 ą178% turbostat.POLL% > > 0.01 ą223% +200.0% 0.04 ą147% turbostat.Pkg%pc2 > > 0.01 ą223% +140.0% 0.02 ą165% turbostat.Pkg%pc6 > > 44.17 +0.4% 44.33 turbostat.PkgTmp > > 284.98 +0.1% 285.28 turbostat.PkgWatt > > 26.78 +0.4% 26.89 turbostat.RAMWatt > > 2095 +0.0% 2095 turbostat.TSC_MHz > > 49585 ą 7% +1.1% 50139 ą 7% meminfo.Active > > 49182 ą 7% +1.4% 49889 ą 7% meminfo.Active(anon) > > 402.33 ą 99% -37.9% 250.00 ą123% meminfo.Active(file) > > 290429 -33.7% 192619 meminfo.AnonHugePages > > 419654 -25.9% 311054 meminfo.AnonPages > > 6.00 ą 47% -16.7% 5.00 ą 44% meminfo.Buffers > > 4026046 -0.1% 4023990 meminfo.Cached > > 98360160 +0.0% 98360160 meminfo.CommitLimit > > 4319751 +0.4% 4337801 meminfo.Committed_AS > > 1.877e+08 -0.1% 1.875e+08 meminfo.DirectMap1G > > 14383445 ą 12% +0.7% 14491306 ą 4% meminfo.DirectMap2M > > 1042426 ą 9% +6.4% 1109328 ą 7% meminfo.DirectMap4k > > 4.00 ą141% -50.0% 2.00 ą223% meminfo.Dirty > > 2048 +0.0% 2048 meminfo.Hugepagesize > > 434675 -26.3% 320518 meminfo.Inactive > > 431330 -26.0% 319346 meminfo.Inactive(anon) > > 3344 ą 95% -65.0% 1171 ą186% meminfo.Inactive(file) > > 124528 -0.1% 124460 meminfo.KReclaimable > > 18433 +0.7% 18559 meminfo.KernelStack > > 40185 ą 2% -0.9% 39837 meminfo.Mapped > > 1.903e+08 +0.1% 1.904e+08 meminfo.MemAvailable > > 1.912e+08 +0.1% 1.913e+08 meminfo.MemFree > > 1.967e+08 +0.0% 1.967e+08 meminfo.MemTotal > > 5569412 -1.8% 5466754 meminfo.Memused > > 4763 -5.7% 4489 meminfo.PageTables > > 51956 +0.0% 51956 meminfo.Percpu > > 124528 -0.1% 124460 meminfo.SReclaimable > > 197128 +0.1% 197293 meminfo.SUnreclaim > > 57535 ą 7% +0.8% 57986 ą 6% meminfo.Shmem > > 321657 +0.0% 321754 meminfo.Slab > > 3964769 -0.0% 3964586 meminfo.Unevictable > > 3.436e+10 +0.0% 3.436e+10 meminfo.VmallocTotal > > 280612 +0.1% 280841 meminfo.VmallocUsed > > 6194619 -2.0% 6071944 meminfo.max_used_kB > > 2626 ą 28% -7.7% 2423 ą 11% numa-meminfo.node0.Active > > 2361 ą 20% -5.3% 2236 ą 10% numa-meminfo.node0.Active(anon) > > 264.67 ą117% -29.5% 186.67 ą152% numa-meminfo.node0.Active(file) > > 135041 ą 20% -22.4% 104774 ą 42% numa-meminfo.node0.AnonHugePages > > 197759 ą 18% -20.4% 157470 ą 35% numa-meminfo.node0.AnonPages > > 235746 ą 19% -11.8% 207988 ą 29% numa-meminfo.node0.AnonPages.max > > 2.00 ą223% +0.0% 2.00 ą223% numa-meminfo.node0.Dirty > > 1386137 ą123% +89.5% 2626100 ą 67% numa-meminfo.node0.FilePages > > 202317 ą 19% -21.0% 159846 ą 36% numa-meminfo.node0.Inactive > > 200223 ą 19% -20.7% 158765 ą 35% numa-meminfo.node0.Inactive(anon) > > 2093 ą129% -48.4% 1080 ą200% numa-meminfo.node0.Inactive(file) > > 46369 ą 57% +43.5% 66525 ą 41% numa-meminfo.node0.KReclaimable > > 9395 ą 4% +4.6% 9822 ą 5% numa-meminfo.node0.KernelStack > > 14343 ą101% +65.1% 23681 ą 58% numa-meminfo.node0.Mapped > > 95532160 -1.3% 94306066 numa-meminfo.node0.MemFree > > 97681544 +0.0% 97681544 numa-meminfo.node0.MemTotal > > 2149382 ą 82% +57.0% 3375476 ą 53% numa-meminfo.node0.MemUsed > > 2356 ą 21% -9.9% 2122 ą 9% numa-meminfo.node0.PageTables > > 46369 ą 57% +43.5% 66525 ą 41% numa-meminfo.node0.SReclaimable > > 109141 ą 6% +1.5% 110817 ą 7% numa-meminfo.node0.SUnreclaim > > 4514 ą 34% -22.4% 3505 ą 30% numa-meminfo.node0.Shmem > > 155511 ą 18% +14.0% 177344 ą 14% numa-meminfo.node0.Slab > > 1379264 ą124% +90.1% 2621327 ą 67% numa-meminfo.node0.Unevictable > > 46974 ą 8% +1.5% 47665 ą 7% numa-meminfo.node1.Active > > 46837 ą 8% +1.6% 47601 ą 7% numa-meminfo.node1.Active(anon) > > 137.33 ą219% -54.0% 63.17 ą 85% numa-meminfo.node1.Active(file) > > 155559 ą 18% -43.5% 87865 ą 52% numa-meminfo.node1.AnonHugePages > > 222077 ą 16% -30.8% 153725 ą 36% numa-meminfo.node1.AnonPages > > 304080 ą 17% -27.5% 220544 ą 28% numa-meminfo.node1.AnonPages.max > > 2.00 ą223% -100.0% 0.00 numa-meminfo.node1.Dirty > > 2639873 ą 65% -47.0% 1397913 ą126% numa-meminfo.node1.FilePages > > 232481 ą 17% -30.8% 160887 ą 34% numa-meminfo.node1.Inactive > > 231228 ą 16% -30.5% 160796 ą 34% numa-meminfo.node1.Inactive(anon) > > 1252 ą213% -92.8% 90.33 ą 96% numa-meminfo.node1.Inactive(file) > > 78155 ą 34% -25.9% 57927 ą 47% numa-meminfo.node1.KReclaimable > > 9041 ą 4% -3.3% 8740 ą 5% numa-meminfo.node1.KernelStack > > 25795 ą 55% -37.5% 16118 ą 85% numa-meminfo.node1.Mapped > > 95619356 +1.4% 96947357 numa-meminfo.node1.MemFree > > 99038776 +0.0% 99038776 numa-meminfo.node1.MemTotal > > 3419418 ą 52% -38.8% 2091417 ą 85% numa-meminfo.node1.MemUsed > > 2405 ą 21% -1.5% 2369 ą 7% numa-meminfo.node1.PageTables > > 78155 ą 34% -25.9% 57927 ą 47% numa-meminfo.node1.SReclaimable > > 87984 ą 7% -1.7% 86475 ą 9% numa-meminfo.node1.SUnreclaim > > 52978 ą 9% +2.9% 54500 ą 8% numa-meminfo.node1.Shmem > > 166140 ą 16% -13.1% 144403 ą 17% numa-meminfo.node1.Slab > > 2585504 ą 66% -48.0% 1343258 ą131% numa-meminfo.node1.Unevictable > > 486.17 ą 9% +6.8% 519.17 ą 7% proc-vmstat.direct_map_level2_splits > > 8.00 ą 22% +2.1% 8.17 ą 8% proc-vmstat.direct_map_level3_splits > > 12303 ą 7% +1.3% 12461 ą 7% proc-vmstat.nr_active_anon > > 100.50 ą 99% -37.8% 62.50 ą123% proc-vmstat.nr_active_file > > 104906 -25.9% 77785 proc-vmstat.nr_anon_pages > > 141.00 -33.6% 93.67 proc-vmstat.nr_anon_transparent_hugepages > > 264.00 ą141% -54.3% 120.67 ą223% proc-vmstat.nr_dirtied > > 1.00 ą141% -50.0% 0.50 ą223% proc-vmstat.nr_dirty > > 4750146 +0.1% 4752612 proc-vmstat.nr_dirty_background_threshold > > 9511907 +0.1% 9516846 proc-vmstat.nr_dirty_threshold > > 1006517 -0.1% 1005995 proc-vmstat.nr_file_pages > > 47787985 +0.1% 47813269 proc-vmstat.nr_free_pages > > 107821 -25.9% 79869 proc-vmstat.nr_inactive_anon > > 836.17 ą 95% -65.1% 292.17 ą186% proc-vmstat.nr_inactive_file > > 18434 +0.7% 18563 proc-vmstat.nr_kernel_stack > > 10033 ą 2% -1.1% 9924 proc-vmstat.nr_mapped > > 1190 -5.7% 1122 proc-vmstat.nr_page_table_pages > > 14387 ą 7% +0.7% 14493 ą 6% proc-vmstat.nr_shmem > > 31131 -0.1% 31114 proc-vmstat.nr_slab_reclaimable > > 49281 +0.1% 49323 proc-vmstat.nr_slab_unreclaimable > > 991192 -0.0% 991146 proc-vmstat.nr_unevictable > > 264.00 ą141% -54.3% 120.67 ą223% proc-vmstat.nr_written > > 12303 ą 7% +1.3% 12461 ą 7% proc-vmstat.nr_zone_active_anon > > 100.50 ą 99% -37.8% 62.50 ą123% proc-vmstat.nr_zone_active_file > > 107821 -25.9% 79869 proc-vmstat.nr_zone_inactive_anon > > 836.17 ą 95% -65.1% 292.17 ą186% proc-vmstat.nr_zone_inactive_file > > 991192 -0.0% 991146 proc-vmstat.nr_zone_unevictable > > 1.00 ą141% -50.0% 0.50 ą223% proc-vmstat.nr_zone_write_pending > > 17990 ą 21% -17.6% 14820 ą 46% proc-vmstat.numa_hint_faults > > 7847 ą 37% -41.5% 4588 ą 26% proc-vmstat.numa_hint_faults_local > > 806662 +0.3% 809070 proc-vmstat.numa_hit > > 488.50 ą 13% -73.4% 130.17 ą 22% proc-vmstat.numa_huge_pte_updates > > 0.00 -100.0% 0.00 proc-vmstat.numa_interleave > > 712588 -0.2% 711419 proc-vmstat.numa_local > > 94077 +0.0% 94084 proc-vmstat.numa_other > > 18894 ą 67% -3.1% 18303 ą 41% proc-vmstat.numa_pages_migrated > > 337482 ą 10% -59.0% 138314 ą 10% proc-vmstat.numa_pte_updates > > 61815 -1.6% 60823 proc-vmstat.pgactivate > > 0.00 -100.0% 0.00 proc-vmstat.pgalloc_dma32 > > 933601 -3.8% 898485 proc-vmstat.pgalloc_normal > > 899579 -0.5% 895253 proc-vmstat.pgfault > > 896972 -3.9% 861819 proc-vmstat.pgfree > > 18894 ą 67% -3.1% 18303 ą 41% proc-vmstat.pgmigrate_success > > 3845 ą100% -66.8% 1277 ą223% proc-vmstat.pgpgin > > 1064 ą141% -54.3% 486.67 ą223% proc-vmstat.pgpgout > > 40396 -0.6% 40172 proc-vmstat.pgreuse > > 105.50 -9.2% 95.83 ą 5% proc-vmstat.thp_collapse_alloc > > 57.00 -87.4% 7.17 ą 5% proc-vmstat.thp_deferred_split_page > > 74.83 -72.4% 20.67 ą 4% proc-vmstat.thp_fault_alloc > > 19.50 ą105% -15.4% 16.50 ą 71% proc-vmstat.thp_migration_success > > 57.00 -87.4% 7.17 ą 5% proc-vmstat.thp_split_pmd > > 0.00 -100.0% 0.00 proc-vmstat.thp_zero_page_alloc > > 17.00 +0.0% 17.00 proc-vmstat.unevictable_pgs_culled > > 589.83 ą 21% -5.2% 559.00 ą 10% numa-vmstat.node0.nr_active_anon > > 66.00 ą117% -29.3% 46.67 ą152% numa-vmstat.node0.nr_active_file > > 49406 ą 18% -20.3% 39355 ą 35% numa-vmstat.node0.nr_anon_pages > > 65.17 ą 21% -22.0% 50.83 ą 42% numa-vmstat.node0.nr_anon_transparent_hugepages > > 132.00 ą223% -8.6% 120.67 ą223% numa-vmstat.node0.nr_dirtied > > 0.50 ą223% +0.0% 0.50 ą223% numa-vmstat.node0.nr_dirty > > 346534 ą123% +89.5% 656525 ą 67% numa-vmstat.node0.nr_file_pages > > 23883055 -1.3% 23576561 numa-vmstat.node0.nr_free_pages > > 50051 ą 19% -20.7% 39679 ą 35% numa-vmstat.node0.nr_inactive_anon > > 522.67 ą129% -48.4% 269.67 ą200% numa-vmstat.node0.nr_inactive_file > > 0.00 -100.0% 0.00 numa-vmstat.node0.nr_isolated_anon > > 9392 ą 4% +4.6% 9823 ą 5% numa-vmstat.node0.nr_kernel_stack > > 3594 ą101% +64.8% 5922 ą 58% numa-vmstat.node0.nr_mapped > > 587.83 ą 21% -9.8% 530.00 ą 9% numa-vmstat.node0.nr_page_table_pages > > 1129 ą 34% -22.4% 876.67 ą 30% numa-vmstat.node0.nr_shmem > > 11591 ą 57% +43.5% 16631 ą 41% numa-vmstat.node0.nr_slab_reclaimable > > 27285 ą 6% +1.5% 27704 ą 7% numa-vmstat.node0.nr_slab_unreclaimable > > 344815 ą124% +90.1% 655331 ą 67% numa-vmstat.node0.nr_unevictable > > 132.00 ą223% -8.6% 120.67 ą223% numa-vmstat.node0.nr_written > > 589.83 ą 21% -5.2% 559.00 ą 10% numa-vmstat.node0.nr_zone_active_anon > > 66.00 ą117% -29.3% 46.67 ą152% numa-vmstat.node0.nr_zone_active_file > > 50051 ą 19% -20.7% 39679 ą 35% numa-vmstat.node0.nr_zone_inactive_anon > > 522.67 ą129% -48.4% 269.67 ą200% numa-vmstat.node0.nr_zone_inactive_file > > 344815 ą124% +90.1% 655331 ą 67% numa-vmstat.node0.nr_zone_unevictable > > 0.50 ą223% +0.0% 0.50 ą223% numa-vmstat.node0.nr_zone_write_pending > > 374134 ą 6% -4.1% 358690 ą 7% numa-vmstat.node0.numa_hit > > 0.00 -100.0% 0.00 numa-vmstat.node0.numa_interleave > > 328256 ą 15% -7.1% 304955 ą 20% numa-vmstat.node0.numa_local > > 45881 ą 75% +17.1% 53735 ą 69% numa-vmstat.node0.numa_other > > 11706 ą 8% +1.7% 11901 ą 7% numa-vmstat.node1.nr_active_anon > > 34.17 ą219% -54.1% 15.67 ą 84% numa-vmstat.node1.nr_active_file > > 55500 ą 16% -30.8% 38424 ą 36% numa-vmstat.node1.nr_anon_pages > > 75.50 ą 18% -43.7% 42.50 ą 53% numa-vmstat.node1.nr_anon_transparent_hugepages > > 132.00 ą223% -100.0% 0.00 numa-vmstat.node1.nr_dirtied > > 0.50 ą223% -100.0% 0.00 numa-vmstat.node1.nr_dirty > > 659985 ą 65% -47.0% 349484 ą126% numa-vmstat.node1.nr_file_pages > > 23904828 +1.4% 24236871 numa-vmstat.node1.nr_free_pages > > 57826 ą 16% -30.5% 40197 ą 34% numa-vmstat.node1.nr_inactive_anon > > 313.00 ą213% -92.9% 22.33 ą 96% numa-vmstat.node1.nr_inactive_file > > 9043 ą 4% -3.3% 8740 ą 5% numa-vmstat.node1.nr_kernel_stack > > 6467 ą 55% -37.6% 4038 ą 85% numa-vmstat.node1.nr_mapped > > 601.50 ą 21% -1.6% 591.83 ą 7% numa-vmstat.node1.nr_page_table_pages > > 13261 ą 9% +2.8% 13630 ą 8% numa-vmstat.node1.nr_shmem > > 19538 ą 34% -25.9% 14481 ą 47% numa-vmstat.node1.nr_slab_reclaimable > > 21995 ą 7% -1.7% 21618 ą 9% numa-vmstat.node1.nr_slab_unreclaimable > > 646375 ą 66% -48.0% 335813 ą131% numa-vmstat.node1.nr_unevictable > > 132.00 ą223% -100.0% 0.00 numa-vmstat.node1.nr_written > > 11706 ą 8% +1.7% 11901 ą 7% numa-vmstat.node1.nr_zone_active_anon > > 34.17 ą219% -54.1% 15.67 ą 84% numa-vmstat.node1.nr_zone_active_file > > 57826 ą 16% -30.5% 40197 ą 34% numa-vmstat.node1.nr_zone_inactive_anon > > 313.00 ą213% -92.9% 22.33 ą 96% numa-vmstat.node1.nr_zone_inactive_file > > 646375 ą 66% -48.0% 335813 ą131% numa-vmstat.node1.nr_zone_unevictable > > 0.50 ą223% -100.0% 0.00 numa-vmstat.node1.nr_zone_write_pending > > 429997 ą 5% +3.5% 444962 ą 5% numa-vmstat.node1.numa_hit > > 0.00 -100.0% 0.00 numa-vmstat.node1.numa_interleave > > 381801 ą 13% +6.0% 404613 ą 14% numa-vmstat.node1.numa_local > > 48195 ą 71% -16.3% 40348 ą 92% numa-vmstat.node1.numa_other > > 2.47 ą 2% -2.0% 2.42 ą 5% perf-stat.i.MPKI > > 3.282e+09 +0.7% 3.305e+09 perf-stat.i.branch-instructions > > 0.41 -0.1 0.33 perf-stat.i.branch-miss-rate% > > 13547319 -16.6% 11300609 perf-stat.i.branch-misses > > 42.88 +0.7 43.53 perf-stat.i.cache-miss-rate% > > 17114713 ą 3% +1.4% 17346470 ą 5% perf-stat.i.cache-misses > > 40081707 ą 2% -0.0% 40073189 ą 5% perf-stat.i.cache-references > > 8192 ą 2% +1.4% 8311 ą 4% perf-stat.i.context-switches > > 8.84 -0.8% 8.77 perf-stat.i.cpi > > 104007 +0.0% 104008 perf-stat.i.cpu-clock > > 1.446e+11 +0.1% 1.447e+11 perf-stat.i.cpu-cycles > > 140.10 -1.0% 138.76 perf-stat.i.cpu-migrations > > 8487 ą 3% -0.9% 8412 ą 6% perf-stat.i.cycles-between-cache-misses > > 0.01 ą 6% -0.0 0.01 perf-stat.i.dTLB-load-miss-rate% > > 434358 ą 3% -16.9% 360889 perf-stat.i.dTLB-load-misses > > 4.316e+09 +1.3% 4.373e+09 perf-stat.i.dTLB-loads > > 0.00 ą 15% -0.0 0.00 ą 9% perf-stat.i.dTLB-store-miss-rate% > > 10408 ą 11% -2.6% 10135 ą 8% perf-stat.i.dTLB-store-misses > > 4.302e+08 +5.5% 4.539e+08 perf-stat.i.dTLB-stores > > 16.21 ą 2% -2.5 13.73 ą 18% perf-stat.i.iTLB-load-miss-rate% > > 394805 ą 5% -26.0% 292089 ą 8% perf-stat.i.iTLB-load-misses > > 2041963 ą 3% -8.3% 1872405 ą 12% perf-stat.i.iTLB-loads > > 1.638e+10 +1.0% 1.654e+10 perf-stat.i.instructions > > 41729 ą 6% +37.4% 57323 ą 8% perf-stat.i.instructions-per-iTLB-miss > > 0.11 +0.8% 0.11 perf-stat.i.ipc > > 0.01 ą 55% -1.5% 0.01 ą 85% perf-stat.i.major-faults > > 1.39 +0.1% 1.39 perf-stat.i.metric.GHz > > 468.46 ą 2% -1.5% 461.59 ą 4% perf-stat.i.metric.K/sec > > 77.18 +1.3% 78.18 perf-stat.i.metric.M/sec > > 2473 -0.0% 2472 perf-stat.i.minor-faults > > 89.67 -0.5 89.18 perf-stat.i.node-load-miss-rate% > > 5070484 -10.3% 4547670 perf-stat.i.node-load-misses > > 585336 ą 2% -5.5% 553260 ą 8% perf-stat.i.node-loads > > 98.73 +0.2 98.91 perf-stat.i.node-store-miss-rate% > > 935187 +2.2% 955923 ą 3% perf-stat.i.node-store-misses > > 13301 ą 8% -12.6% 11631 ą 5% perf-stat.i.node-stores > > 2473 -0.0% 2472 perf-stat.i.page-faults > > 104007 +0.0% 104008 perf-stat.i.task-clock > > 2.45 ą 2% -1.0% 2.42 ą 5% perf-stat.overall.MPKI > > 0.41 -0.1 0.34 perf-stat.overall.branch-miss-rate% > > 42.68 +0.6 43.26 perf-stat.overall.cache-miss-rate% > > 8.83 -0.9% 8.75 perf-stat.overall.cpi > > 8459 ą 3% -1.0% 8372 ą 6% perf-stat.overall.cycles-between-cache-misses > > 0.01 ą 3% -0.0 0.01 perf-stat.overall.dTLB-load-miss-rate% > > 0.00 ą 11% -0.0 0.00 ą 8% perf-stat.overall.dTLB-store-miss-rate% > > 16.19 ą 2% -2.5 13.73 ą 18% perf-stat.overall.iTLB-load-miss-rate% > > 41644 ą 6% +37.0% 57047 ą 8% perf-stat.overall.instructions-per-iTLB-miss > > 0.11 +0.9% 0.11 perf-stat.overall.ipc > > 89.65 -0.5 89.15 perf-stat.overall.node-load-miss-rate% > > 98.59 +0.2 98.78 perf-stat.overall.node-store-miss-rate% > > 35314961 +28.0% 45213422 ą 3% perf-stat.overall.path-length > > 3.272e+09 +0.7% 3.295e+09 perf-stat.ps.branch-instructions > > 13563215 -16.5% 11329031 perf-stat.ps.branch-misses > > 17059170 ą 3% +1.3% 17288798 ą 5% perf-stat.ps.cache-misses > > 39960738 ą 2% -0.0% 39951411 ą 5% perf-stat.ps.cache-references > > 8205 ą 2% +1.4% 8320 ą 4% perf-stat.ps.context-switches > > 103658 -0.0% 103657 perf-stat.ps.cpu-clock > > 1.441e+11 +0.1% 1.442e+11 perf-stat.ps.cpu-cycles > > 140.16 -1.0% 138.77 perf-stat.ps.cpu-migrations > > 433133 ą 3% -16.9% 359910 perf-stat.ps.dTLB-load-misses > > 4.302e+09 +1.3% 4.359e+09 perf-stat.ps.dTLB-loads > > 10392 ą 11% -2.6% 10120 ą 8% perf-stat.ps.dTLB-store-misses > > 4.29e+08 +5.5% 4.527e+08 perf-stat.ps.dTLB-stores > > 393499 ą 5% -26.0% 291118 ą 8% perf-stat.ps.iTLB-load-misses > > 2035052 ą 3% -8.3% 1866106 ą 12% perf-stat.ps.iTLB-loads > > 1.633e+10 +1.0% 1.649e+10 perf-stat.ps.instructions > > 0.01 ą 55% +0.1% 0.01 ą 85% perf-stat.ps.major-faults > > 2466 +0.0% 2466 perf-stat.ps.minor-faults > > 5053378 -10.3% 4532205 perf-stat.ps.node-load-misses > > 583428 ą 2% -5.5% 551516 ą 8% perf-stat.ps.node-loads > > 932227 +2.2% 952780 ą 3% perf-stat.ps.node-store-misses > > 13342 ą 8% -12.1% 11729 ą 6% perf-stat.ps.node-stores > > 2466 +0.0% 2466 perf-stat.ps.page-faults > > 103658 -0.0% 103657 perf-stat.ps.task-clock > > 4.952e+12 +0.9% 4.994e+12 perf-stat.total.instructions > > 10.88 ą223% -100.0% 0.00 sched_debug.cfs_rq:/.MIN_vruntime.avg > > 1132 ą223% -100.0% 0.00 sched_debug.cfs_rq:/.MIN_vruntime.max > > 0.00 +0.0% 0.00 sched_debug.cfs_rq:/.MIN_vruntime.min > > 110.47 ą223% -100.0% 0.00 sched_debug.cfs_rq:/.MIN_vruntime.stddev > > 0.53 ą 4% +7.4% 0.57 ą 4% sched_debug.cfs_rq:/.h_nr_running.avg > > 1.03 ą 7% -3.2% 1.00 sched_debug.cfs_rq:/.h_nr_running.max > > 0.45 ą 2% -1.9% 0.44 ą 3% sched_debug.cfs_rq:/.h_nr_running.stddev > > 11896 ą 12% -0.1% 11883 ą 13% sched_debug.cfs_rq:/.load.avg > > 123097 ą123% -80.1% 24487 ą 18% sched_debug.cfs_rq:/.load.max > > 19029 ą 74% -49.9% 9525 ą 13% sched_debug.cfs_rq:/.load.stddev > > 22.63 ą 23% +1.4% 22.93 ą 16% sched_debug.cfs_rq:/.load_avg.avg > > 530.85 ą 73% -13.1% 461.19 ą 43% sched_debug.cfs_rq:/.load_avg.max > > 73.53 ą 46% -7.1% 68.30 ą 33% sched_debug.cfs_rq:/.load_avg.stddev > > 10.88 ą223% -100.0% 0.00 sched_debug.cfs_rq:/.max_vruntime.avg > > 1132 ą223% -100.0% 0.00 sched_debug.cfs_rq:/.max_vruntime.max > > 0.00 +0.0% 0.00 sched_debug.cfs_rq:/.max_vruntime.min > > 110.47 ą223% -100.0% 0.00 sched_debug.cfs_rq:/.max_vruntime.stddev > > 3883756 ą 13% +12.7% 4377466 ą 4% sched_debug.cfs_rq:/.min_vruntime.avg > > 6993455 ą 10% +6.5% 7445221 ą 2% sched_debug.cfs_rq:/.min_vruntime.max > > 219925 ą 60% +43.7% 315970 ą 71% sched_debug.cfs_rq:/.min_vruntime.min > > 2240239 ą 11% +14.0% 2554847 ą 14% sched_debug.cfs_rq:/.min_vruntime.stddev > > 0.53 ą 5% +7.5% 0.57 ą 4% sched_debug.cfs_rq:/.nr_running.avg > > 1.03 ą 7% -3.2% 1.00 sched_debug.cfs_rq:/.nr_running.max > > 0.45 ą 2% -1.9% 0.44 ą 3% sched_debug.cfs_rq:/.nr_running.stddev > > 6.96 ą 55% +26.9% 8.83 ą 45% sched_debug.cfs_rq:/.removed.load_avg.avg > > 305.28 ą 32% +39.3% 425.39 ą 44% sched_debug.cfs_rq:/.removed.load_avg.max > > 42.94 ą 36% +34.4% 57.70 ą 42% sched_debug.cfs_rq:/.removed.load_avg.stddev > > 2.96 ą 58% +39.1% 4.12 ą 48% sched_debug.cfs_rq:/.removed.runnable_avg.avg > > 150.06 ą 34% +44.0% 216.03 ą 45% sched_debug.cfs_rq:/.removed.runnable_avg.max > > 19.33 ą 42% +42.6% 27.56 ą 45% sched_debug.cfs_rq:/.removed.runnable_avg.stddev > > 2.96 ą 58% +39.1% 4.12 ą 48% sched_debug.cfs_rq:/.removed.util_avg.avg > > 150.06 ą 34% +44.0% 216.03 ą 45% sched_debug.cfs_rq:/.removed.util_avg.max > > 19.33 ą 42% +42.6% 27.56 ą 45% sched_debug.cfs_rq:/.removed.util_avg.stddev > > 540.76 ą 6% +7.5% 581.25 ą 5% sched_debug.cfs_rq:/.runnable_avg.avg > > 1060 ą 2% +2.5% 1087 ą 3% sched_debug.cfs_rq:/.runnable_avg.max > > 442.07 ą 4% -0.1% 441.69 ą 5% sched_debug.cfs_rq:/.runnable_avg.stddev > > 3123464 ą 14% +10.0% 3436745 ą 3% sched_debug.cfs_rq:/.spread0.avg > > 6233151 ą 10% +4.4% 6504505 ą 3% sched_debug.cfs_rq:/.spread0.max > > -540338 +15.6% -624739 sched_debug.cfs_rq:/.spread0.min > > 2240217 ą 11% +14.0% 2554844 ą 14% sched_debug.cfs_rq:/.spread0.stddev > > 540.71 ą 6% +7.5% 581.22 ą 5% sched_debug.cfs_rq:/.util_avg.avg > > 1060 ą 2% +2.5% 1086 ą 3% sched_debug.cfs_rq:/.util_avg.max > > 442.07 ą 4% -0.1% 441.67 ą 5% sched_debug.cfs_rq:/.util_avg.stddev > > 454.69 ą 6% +7.0% 486.47 ą 8% sched_debug.cfs_rq:/.util_est_enqueued.avg > > 1024 -0.0% 1023 sched_debug.cfs_rq:/.util_est_enqueued.max > > 396.02 ą 2% -0.1% 395.79 sched_debug.cfs_rq:/.util_est_enqueued.stddev > > 642171 ą 4% +16.6% 748912 ą 2% sched_debug.cpu.avg_idle.avg > > 1051166 -1.2% 1038098 sched_debug.cpu.avg_idle.max > > 2402 ą 5% +28.5% 3088 ą 9% sched_debug.cpu.avg_idle.min > > 384501 ą 3% -12.3% 337306 ą 5% sched_debug.cpu.avg_idle.stddev > > 198632 ą 7% +5.1% 208788 sched_debug.cpu.clock.avg > > 198638 ą 7% +5.1% 208794 sched_debug.cpu.clock.max > > 198626 ą 7% +5.1% 208783 sched_debug.cpu.clock.min > > 3.25 +2.3% 3.32 ą 5% sched_debug.cpu.clock.stddev > > 196832 ą 7% +5.1% 206882 sched_debug.cpu.clock_task.avg > > 197235 ą 7% +5.1% 207282 sched_debug.cpu.clock_task.max > > 181004 ą 7% +5.7% 191329 sched_debug.cpu.clock_task.min > > 1575 ą 3% -1.8% 1546 sched_debug.cpu.clock_task.stddev > > 2411 ą 4% +2.8% 2478 sched_debug.cpu.curr->pid.avg > > 8665 ą 4% +3.1% 8935 sched_debug.cpu.curr->pid.max > > 2522 ą 2% +1.0% 2548 sched_debug.cpu.curr->pid.stddev > > 501318 -0.0% 501249 sched_debug.cpu.max_idle_balance_cost.avg > > 528365 +0.5% 531236 ą 2% sched_debug.cpu.max_idle_balance_cost.max > > 500000 +0.0% 500000 sched_debug.cpu.max_idle_balance_cost.min > > 5157 ą 19% -4.2% 4941 ą 23% sched_debug.cpu.max_idle_balance_cost.stddev > > 4294 +0.0% 4294 sched_debug.cpu.next_balance.avg > > 4294 +0.0% 4294 sched_debug.cpu.next_balance.max > > 4294 +0.0% 4294 sched_debug.cpu.next_balance.min > > 0.00 ą 41% -40.0% 0.00 ą 13% sched_debug.cpu.next_balance.stddev > > 0.44 ą 4% +2.4% 0.45 sched_debug.cpu.nr_running.avg > > 1.00 +0.0% 1.00 sched_debug.cpu.nr_running.max > > 0.47 +0.5% 0.47 sched_debug.cpu.nr_running.stddev > > 14345 ą 8% +6.7% 15305 ą 4% sched_debug.cpu.nr_switches.avg > > 30800 ą 8% +34.5% 41437 ą 10% sched_debug.cpu.nr_switches.max > > 4563 ą 28% +5.7% 4822 ą 25% sched_debug.cpu.nr_switches.min > > 5491 ą 8% +26.4% 6941 ą 10% sched_debug.cpu.nr_switches.stddev > > 2.111e+09 ą 7% +1.5% 2.142e+09 ą 6% sched_debug.cpu.nr_uninterruptible.avg > > 4.295e+09 +0.0% 4.295e+09 sched_debug.cpu.nr_uninterruptible.max > > 2.14e+09 +0.1% 2.143e+09 sched_debug.cpu.nr_uninterruptible.stddev > > 198627 ą 7% +5.1% 208783 sched_debug.cpu_clk > > 996147 +0.0% 996147 sched_debug.dl_rq:.dl_bw->bw.avg > > 996147 +0.0% 996147 sched_debug.dl_rq:.dl_bw->bw.max > > 996147 +0.0% 996147 sched_debug.dl_rq:.dl_bw->bw.min > > 4.295e+09 +0.0% 4.295e+09 sched_debug.jiffies > > 198022 ą 7% +5.1% 208178 sched_debug.ktime > > 950.00 +0.0% 950.00 sched_debug.rt_rq:.rt_runtime.avg > > 950.00 +0.0% 950.00 sched_debug.rt_rq:.rt_runtime.max > > 950.00 +0.0% 950.00 sched_debug.rt_rq:.rt_runtime.min > > 199377 ą 7% +5.1% 209531 sched_debug.sched_clk > > 1.00 +0.0% 1.00 sched_debug.sched_clock_stable() > > 58611259 +0.0% 58611259 sched_debug.sysctl_sched.sysctl_sched_features > > 0.75 +0.0% 0.75 sched_debug.sysctl_sched.sysctl_sched_idle_min_granularity > > 24.00 +0.0% 24.00 sched_debug.sysctl_sched.sysctl_sched_latency > > 3.00 +0.0% 3.00 sched_debug.sysctl_sched.sysctl_sched_min_granularity > > 1.00 +0.0% 1.00 sched_debug.sysctl_sched.sysctl_sched_tunable_scaling > > 4.00 +0.0% 4.00 sched_debug.sysctl_sched.sysctl_sched_wakeup_granularity > > 20.90 ą 47% -6.4 14.49 ą100% perf-profile.calltrace.cycles-pp.mwait_idle_with_hints.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call > > 20.90 ą 47% -6.4 14.49 ą100% perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle > > 0.48 ą 44% -0.5 0.00 perf-profile.calltrace.cycles-pp.do_mmap.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap > > 29.41 ą 19% -0.2 29.23 ą 18% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry > > 35.02 ą 8% -0.2 34.86 ą 7% perf-profile.calltrace.cycles-pp.__mmap > > 34.95 ą 8% -0.1 34.81 ą 7% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__mmap > > 34.92 ą 8% -0.1 34.79 ą 7% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap > > 34.87 ą 8% -0.1 34.74 ą 7% perf-profile.calltrace.cycles-pp.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap > > 0.41 ą 74% -0.1 0.30 ą156% perf-profile.calltrace.cycles-pp.cpu_startup_entry.rest_init.arch_call_rest_init.start_kernel.secondary_startup_64_no_verify > > 0.41 ą 74% -0.1 0.30 ą156% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.rest_init.arch_call_rest_init.start_kernel > > 0.41 ą 74% -0.1 0.30 ą156% perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.rest_init.arch_call_rest_init > > 0.41 ą 74% -0.1 0.30 ą156% perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.rest_init > > 0.41 ą 74% -0.1 0.30 ą156% perf-profile.calltrace.cycles-pp.start_kernel.secondary_startup_64_no_verify > > 0.41 ą 74% -0.1 0.30 ą156% perf-profile.calltrace.cycles-pp.arch_call_rest_init.start_kernel.secondary_startup_64_no_verify > > 0.41 ą 74% -0.1 0.30 ą156% perf-profile.calltrace.cycles-pp.rest_init.arch_call_rest_init.start_kernel.secondary_startup_64_no_verify > > 29.59 ą 19% -0.1 29.50 ą 17% perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify > > 29.03 ą 19% -0.1 28.95 ą 17% perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify > > 29.03 ą 19% -0.1 28.95 ą 17% perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify > > 29.03 ą 19% -0.1 28.95 ą 17% perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify > > 29.00 ą 19% -0.1 28.93 ą 17% perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify > > 29.00 ą 19% -0.1 28.93 ą 17% perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary > > 33.56 ą 8% -0.0 33.53 ą 7% perf-profile.calltrace.cycles-pp.osq_lock.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff > > 34.26 ą 8% -0.0 34.24 ą 7% perf-profile.calltrace.cycles-pp.down_write_killable.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe.__mmap > > 34.23 ą 8% -0.0 34.21 ą 7% perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 34.19 ą 8% -0.0 34.18 ą 7% perf-profile.calltrace.cycles-pp.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff.do_syscall_64 > > 0.44 ą 44% +0.0 0.48 ą 44% perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable.__vm_munmap > > 0.45 ą 44% +0.0 0.48 ą 44% perf-profile.calltrace.cycles-pp.rwsem_spin_on_owner.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable.vm_mmap_pgoff > > 33.62 ą 8% +0.1 33.71 ą 7% perf-profile.calltrace.cycles-pp.osq_lock.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable.__vm_munmap > > 34.32 ą 8% +0.1 34.42 ą 7% perf-profile.calltrace.cycles-pp.down_write_killable.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 34.29 ą 8% +0.1 34.39 ą 7% perf-profile.calltrace.cycles-pp.rwsem_down_write_slowpath.down_write_killable.__vm_munmap.__x64_sys_munmap.do_syscall_64 > > 34.25 ą 8% +0.1 34.36 ą 7% perf-profile.calltrace.cycles-pp.rwsem_optimistic_spin.rwsem_down_write_slowpath.down_write_killable.__vm_munmap.__x64_sys_munmap > > 35.11 ą 8% +0.2 35.31 ą 7% perf-profile.calltrace.cycles-pp.__munmap > > 35.04 ą 8% +0.2 35.25 ą 7% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap > > 35.02 ą 8% +0.2 35.24 ą 7% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap > > 0.00 +0.2 0.22 ą223% perf-profile.calltrace.cycles-pp.intel_idle_irq.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle > > 34.97 ą 8% +0.2 35.20 ą 7% perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap > > 34.97 ą 8% +0.2 35.20 ą 7% perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap > > 0.47 ą 44% +0.2 0.70 ą 7% perf-profile.calltrace.cycles-pp.do_mas_align_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe > > 0.00 +0.4 0.44 ą223% perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.mwait_idle_with_hints.intel_idle_irq.cpuidle_enter_state.cpuidle_enter > > 8.27 ą 91% +6.2 14.46 ą 77% perf-profile.calltrace.cycles-pp.mwait_idle_with_hints.intel_idle_ibrs.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call > > 8.27 ą 91% +6.2 14.46 ą 77% perf-profile.calltrace.cycles-pp.intel_idle_ibrs.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle > > 21.09 ą 47% -6.5 14.62 ą 99% perf-profile.children.cycles-pp.intel_idle > > 35.02 ą 8% -0.2 34.86 ą 7% perf-profile.children.cycles-pp.__mmap > > 0.14 ą 9% -0.1 0.00 perf-profile.children.cycles-pp.thp_get_unmapped_area > > 34.87 ą 8% -0.1 34.74 ą 7% perf-profile.children.cycles-pp.vm_mmap_pgoff > > 0.55 ą 9% -0.1 0.46 ą 7% perf-profile.children.cycles-pp.do_mmap > > 29.59 ą 19% -0.1 29.50 ą 17% perf-profile.children.cycles-pp.secondary_startup_64_no_verify > > 29.59 ą 19% -0.1 29.50 ą 17% perf-profile.children.cycles-pp.cpu_startup_entry > > 29.59 ą 19% -0.1 29.50 ą 17% perf-profile.children.cycles-pp.do_idle > > 29.03 ą 19% -0.1 28.95 ą 17% perf-profile.children.cycles-pp.start_secondary > > 29.56 ą 19% -0.1 29.49 ą 17% perf-profile.children.cycles-pp.cpuidle_idle_call > > 29.56 ą 19% -0.1 29.48 ą 17% perf-profile.children.cycles-pp.cpuidle_enter > > 29.56 ą 19% -0.1 29.48 ą 17% perf-profile.children.cycles-pp.cpuidle_enter_state > > 29.52 ą 19% -0.1 29.45 ą 17% perf-profile.children.cycles-pp.mwait_idle_with_hints > > 0.38 ą 9% -0.1 0.32 ą 6% perf-profile.children.cycles-pp.mmap_region > > 0.05 ą 7% -0.1 0.00 perf-profile.children.cycles-pp.unmap_vmas > > 0.11 ą 8% -0.1 0.06 ą 13% perf-profile.children.cycles-pp.unmap_region > > 0.16 ą 10% -0.0 0.13 ą 9% perf-profile.children.cycles-pp.get_unmapped_area > > 0.07 ą 7% -0.0 0.03 ą 70% perf-profile.children.cycles-pp.mas_find > > 0.05 ą 44% -0.0 0.02 ą141% perf-profile.children.cycles-pp.mas_wr_node_store > > 0.10 ą 10% -0.0 0.07 ą 14% perf-profile.children.cycles-pp.mas_spanning_rebalance > > 0.14 ą 9% -0.0 0.11 ą 9% perf-profile.children.cycles-pp.arch_get_unmapped_area_topdown > > 0.06 ą 11% -0.0 0.04 ą 72% perf-profile.children.cycles-pp.__schedule > > 0.14 ą 10% -0.0 0.11 ą 9% perf-profile.children.cycles-pp.vm_unmapped_area > > 0.07 ą 10% -0.0 0.04 ą 45% perf-profile.children.cycles-pp.do_mas_munmap > > 0.02 ą 99% -0.0 0.00 perf-profile.children.cycles-pp.mas_next_entry > > 0.04 ą 44% -0.0 0.02 ą141% perf-profile.children.cycles-pp.schedule > > 0.06 ą 9% -0.0 0.04 ą 71% perf-profile.children.cycles-pp.mas_wr_modify > > 0.10 ą 8% -0.0 0.08 ą 11% perf-profile.children.cycles-pp.mas_rev_awalk > > 0.10 ą 12% -0.0 0.08 ą 16% perf-profile.children.cycles-pp.mas_wr_spanning_store > > 0.06 ą 7% -0.0 0.04 ą 45% perf-profile.children.cycles-pp.mas_walk > > 0.09 ą 11% -0.0 0.08 ą 16% perf-profile.children.cycles-pp.syscall_exit_to_user_mode > > 0.02 ą141% -0.0 0.00 perf-profile.children.cycles-pp.perf_event_mmap > > 0.02 ą141% -0.0 0.00 perf-profile.children.cycles-pp.unmap_page_range > > 0.11 ą 26% -0.0 0.10 ą 10% perf-profile.children.cycles-pp.__get_user_nocheck_8 > > 0.35 ą 19% -0.0 0.34 ą 11% perf-profile.children.cycles-pp.perf_tp_event > > 0.11 ą 26% -0.0 0.10 ą 11% perf-profile.children.cycles-pp.perf_callchain_user > > 0.34 ą 19% -0.0 0.33 ą 10% perf-profile.children.cycles-pp.__perf_event_overflow > > 0.34 ą 19% -0.0 0.33 ą 10% perf-profile.children.cycles-pp.perf_event_output_forward > > 0.31 ą 19% -0.0 0.30 ą 12% perf-profile.children.cycles-pp.perf_prepare_sample > > 0.30 ą 19% -0.0 0.29 ą 10% perf-profile.children.cycles-pp.perf_callchain > > 0.30 ą 19% -0.0 0.29 ą 10% perf-profile.children.cycles-pp.get_perf_callchain > > 0.12 ą 9% -0.0 0.11 ą 9% perf-profile.children.cycles-pp.mas_empty_area_rev > > 0.08 ą 7% -0.0 0.07 ą 8% perf-profile.children.cycles-pp.syscall_return_via_sysret > > 0.01 ą223% -0.0 0.00 perf-profile.children.cycles-pp.mas_wr_bnode > > 0.01 ą223% -0.0 0.00 perf-profile.children.cycles-pp.perf_event_mmap_event > > 0.01 ą223% -0.0 0.00 perf-profile.children.cycles-pp.__entry_text_start > > 0.33 ą 10% -0.0 0.32 ą 7% perf-profile.children.cycles-pp.mas_store_prealloc > > 0.32 ą 20% -0.0 0.32 ą 10% perf-profile.children.cycles-pp.update_curr > > 0.32 ą 19% -0.0 0.31 ą 11% perf-profile.children.cycles-pp.perf_trace_sched_stat_runtime > > 0.56 ą 22% -0.0 0.56 ą 58% perf-profile.children.cycles-pp.start_kernel > > 0.56 ą 22% -0.0 0.56 ą 58% perf-profile.children.cycles-pp.arch_call_rest_init > > 0.56 ą 22% -0.0 0.56 ą 58% perf-profile.children.cycles-pp.rest_init > > 0.07 ą 45% -0.0 0.07 ą 11% perf-profile.children.cycles-pp.native_irq_return_iret > > 0.01 ą223% +0.0 0.01 ą223% perf-profile.children.cycles-pp.ktime_get_update_offsets_now > > 0.06 ą 45% +0.0 0.06 ą 8% perf-profile.children.cycles-pp.asm_exc_page_fault > > 0.18 ą 16% +0.0 0.18 ą 14% perf-profile.children.cycles-pp.perf_callchain_kernel > > 0.12 ą 16% +0.0 0.12 ą 12% perf-profile.children.cycles-pp.unwind_next_frame > > 0.36 ą 18% +0.0 0.37 ą 10% perf-profile.children.cycles-pp.task_tick_fair > > 0.58 ą 14% +0.0 0.58 ą 10% perf-profile.children.cycles-pp.hrtimer_interrupt > > 0.49 ą 14% +0.0 0.50 ą 11% perf-profile.children.cycles-pp.__hrtimer_run_queues > > 0.05 ą 46% +0.0 0.05 ą 45% perf-profile.children.cycles-pp.__unwind_start > > 0.45 ą 14% +0.0 0.46 ą 11% perf-profile.children.cycles-pp.tick_sched_handle > > 0.46 ą 14% +0.0 0.46 ą 11% perf-profile.children.cycles-pp.tick_sched_timer > > 0.45 ą 15% +0.0 0.45 ą 11% perf-profile.children.cycles-pp.update_process_times > > 0.06 ą 11% +0.0 0.07 ą 12% perf-profile.children.cycles-pp.kmem_cache_free_bulk > > 0.58 ą 14% +0.0 0.58 ą 10% perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt > > 0.00 +0.0 0.01 ą223% perf-profile.children.cycles-pp.record__mmap_read_evlist > > 0.00 +0.0 0.01 ą223% perf-profile.children.cycles-pp.perf_mmap__push > > 0.00 +0.0 0.01 ą223% perf-profile.children.cycles-pp.record__pushfn > > 0.00 +0.0 0.01 ą223% perf-profile.children.cycles-pp.ksys_write > > 0.00 +0.0 0.01 ą223% perf-profile.children.cycles-pp.vfs_write > > 0.00 +0.0 0.01 ą223% perf-profile.children.cycles-pp.__libc_write > > 0.00 +0.0 0.01 ą223% perf-profile.children.cycles-pp.generic_file_write_iter > > 0.00 +0.0 0.01 ą223% perf-profile.children.cycles-pp.__generic_file_write_iter > > 0.00 +0.0 0.01 ą223% perf-profile.children.cycles-pp.generic_perform_write > > 0.00 +0.0 0.01 ą223% perf-profile.children.cycles-pp.build_id__mark_dso_hit > > 0.39 ą 17% +0.0 0.40 ą 10% perf-profile.children.cycles-pp.scheduler_tick > > 0.00 +0.0 0.01 ą223% perf-profile.children.cycles-pp.clockevents_program_event > > 0.05 ą 45% +0.0 0.06 ą 11% perf-profile.children.cycles-pp.mas_wr_store_entry > > 0.60 ą 14% +0.0 0.61 ą 9% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt > > 0.08 ą 8% +0.0 0.10 ą 12% perf-profile.children.cycles-pp.mas_destroy > > 0.08 ą 9% +0.0 0.09 ą 21% perf-profile.children.cycles-pp.perf_session__deliver_event > > 0.08 ą 12% +0.0 0.09 ą 33% perf-profile.children.cycles-pp.ordered_events__queue > > 0.08 ą 11% +0.0 0.10 ą 22% perf-profile.children.cycles-pp.__ordered_events__flush > > 0.08 ą 9% +0.0 0.10 ą 22% perf-profile.children.cycles-pp.perf_session__process_user_event > > 0.06 ą 13% +0.0 0.08 ą 14% perf-profile.children.cycles-pp.kmem_cache_alloc > > 0.07 ą 9% +0.0 0.09 ą 33% perf-profile.children.cycles-pp.queue_event > > 0.08 ą 8% +0.0 0.10 ą 31% perf-profile.children.cycles-pp.process_simple > > 0.00 +0.0 0.03 ą100% perf-profile.children.cycles-pp.evlist__parse_sample > > 0.06 ą 6% +0.0 0.08 ą 8% perf-profile.children.cycles-pp.memset_erms > > 0.22 ą 7% +0.0 0.26 ą 23% perf-profile.children.cycles-pp.__libc_start_main > > 0.22 ą 7% +0.0 0.26 ą 23% perf-profile.children.cycles-pp.main > > 0.22 ą 7% +0.0 0.26 ą 23% perf-profile.children.cycles-pp.run_builtin > > 0.21 ą 9% +0.0 0.25 ą 23% perf-profile.children.cycles-pp.cmd_record > > 0.21 ą 9% +0.0 0.25 ą 23% perf-profile.children.cycles-pp.__cmd_record > > 0.20 ą 9% +0.0 0.24 ą 24% perf-profile.children.cycles-pp.cmd_sched > > 0.17 ą 11% +0.0 0.21 ą 25% perf-profile.children.cycles-pp.reader__read_event > > 0.17 ą 11% +0.0 0.21 ą 26% perf-profile.children.cycles-pp.record__finish_output > > 0.17 ą 11% +0.0 0.21 ą 26% perf-profile.children.cycles-pp.perf_session__process_events > > 0.00 +0.0 0.04 ą 45% perf-profile.children.cycles-pp.kmem_cache_free > > 0.17 ą 7% +0.1 0.22 ą 8% perf-profile.children.cycles-pp.mas_alloc_nodes > > 0.11 ą 9% +0.1 0.17 ą 6% perf-profile.children.cycles-pp.kmem_cache_alloc_bulk > > 0.00 +0.1 0.06 ą 13% perf-profile.children.cycles-pp.vm_area_dup > > 0.16 ą 8% +0.1 0.22 ą 6% perf-profile.children.cycles-pp.mas_preallocate > > 67.20 ą 8% +0.1 67.28 ą 7% perf-profile.children.cycles-pp.osq_lock > > 68.59 ą 8% +0.1 68.66 ą 7% perf-profile.children.cycles-pp.down_write_killable > > 1.04 ą 8% +0.1 1.12 ą 7% perf-profile.children.cycles-pp.rwsem_spin_on_owner > > 70.08 ą 8% +0.1 70.15 ą 7% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe > > 68.52 ą 8% +0.1 68.60 ą 7% perf-profile.children.cycles-pp.rwsem_down_write_slowpath > > 70.03 ą 8% +0.1 70.11 ą 7% perf-profile.children.cycles-pp.do_syscall_64 > > 68.46 ą 8% +0.1 68.55 ą 7% perf-profile.children.cycles-pp.rwsem_optimistic_spin > > 0.55 ą 8% +0.2 0.71 ą 8% perf-profile.children.cycles-pp.do_mas_align_munmap > > 35.12 ą 8% +0.2 35.31 ą 7% perf-profile.children.cycles-pp.__munmap > > 0.00 +0.2 0.22 ą 7% perf-profile.children.cycles-pp.vma_expand > > 0.00 +0.2 0.22 ą223% perf-profile.children.cycles-pp.intel_idle_irq > > 34.98 ą 8% +0.2 35.20 ą 7% perf-profile.children.cycles-pp.__x64_sys_munmap > > 34.97 ą 8% +0.2 35.20 ą 7% perf-profile.children.cycles-pp.__vm_munmap > > 0.64 ą 13% +0.2 0.88 ą 55% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt > > 0.00 +0.3 0.30 ą 7% perf-profile.children.cycles-pp.__vma_adjust > > 0.00 +0.4 0.36 ą 6% perf-profile.children.cycles-pp.__split_vma > > 8.42 ą 91% +6.2 14.60 ą 77% perf-profile.children.cycles-pp.intel_idle_ibrs > > 29.52 ą 19% -0.1 29.45 ą 17% perf-profile.self.cycles-pp.mwait_idle_with_hints > > 0.18 ą 9% -0.1 0.12 ą 10% perf-profile.self.cycles-pp.rwsem_optimistic_spin > > 0.04 ą 45% -0.0 0.00 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe > > 0.04 ą 44% -0.0 0.00 perf-profile.self.cycles-pp.mmap_region > > 0.10 ą 5% -0.0 0.08 ą 9% perf-profile.self.cycles-pp.mas_rev_awalk > > 0.06 ą 7% -0.0 0.04 ą 45% perf-profile.self.cycles-pp.mas_walk > > 0.06 ą 11% -0.0 0.04 ą 45% perf-profile.self.cycles-pp.do_mas_align_munmap > > 0.08 ą 8% -0.0 0.07 ą 14% perf-profile.self.cycles-pp.syscall_exit_to_user_mode > > 0.08 ą 7% -0.0 0.07 ą 8% perf-profile.self.cycles-pp.syscall_return_via_sysret > > 0.06 ą 13% -0.0 0.05 ą 7% perf-profile.self.cycles-pp.down_write_killable > > 0.07 ą 45% -0.0 0.07 ą 11% perf-profile.self.cycles-pp.native_irq_return_iret > > 0.05 ą 45% -0.0 0.05 ą 47% perf-profile.self.cycles-pp.unwind_next_frame > > 0.00 +0.0 0.01 ą223% perf-profile.self.cycles-pp.ktime_get_update_offsets_now > > 0.05 ą 45% +0.0 0.06 ą 11% perf-profile.self.cycles-pp.kmem_cache_free_bulk > > 0.00 +0.0 0.02 ą141% perf-profile.self.cycles-pp.kmem_cache_free > > 0.07 ą 8% +0.0 0.09 ą 33% perf-profile.self.cycles-pp.queue_event > > 0.06 ą 8% +0.0 0.08 ą 8% perf-profile.self.cycles-pp.memset_erms > > 0.04 ą 45% +0.0 0.08 ą 6% perf-profile.self.cycles-pp.kmem_cache_alloc_bulk > > 66.61 ą 8% +0.1 66.68 ą 7% perf-profile.self.cycles-pp.osq_lock > > 1.02 ą 8% +0.1 1.10 ą 7% perf-profile.self.cycles-pp.rwsem_spin_on_owner > > > > > > > > If you fix the issue, kindly add following tag > > | Reported-by: kernel test robot <yujie.liu@xxxxxxxxx> > > | Link: https://lore.kernel.org/oe-lkp/202212151657.5d11a672-yujie.liu@xxxxxxxxx > > > > > > To reproduce: > > > > git clone https://github.com/intel/lkp-tests.git > > cd lkp-tests > > sudo bin/lkp install job.yaml # job file is attached in this email > > bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run > > sudo bin/lkp run generated-yaml-file > > > > # if come across any failure that blocks the test, > > # please remove ~/.lkp and /lkp dir to run from a clean state. > > > > > > Disclaimer: > > Results have been estimated based on internal Intel analysis and are provided > > for informational purposes only. Any difference in system hardware or software > > design or configuration may affect actual performance. > > > > > > -- > > 0-DAY CI Kernel Test Service > > https://01.org/lkp