On Fri, Jul 19, 2024 at 2:44 AM Oliver Sang <oliver.sang@xxxxxxxxx> wrote:
>
> hi, Yu Zhao,
>
> On Wed, Jul 17, 2024 at 09:44:33AM -0600, Yu Zhao wrote:
> > On Wed, Jul 17, 2024 at 2:36 AM Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
> > >
> > > Hi Janosch and Oliver,
> > >
> > > On Wed, Jul 17, 2024 at 1:57 AM Janosch Frank <frankja@xxxxxxxxxxxxx> wrote:
> > > >
> > > > On 7/9/24 07:11, kernel test robot wrote:
> > > > > Hello,
> > > > >
> > > > > kernel test robot noticed a -34.3% regression of vm-scalability.throughput on:
> > > > >
> > > > >
> > > > > commit: 875fa64577da9bc8e9963ee14fef8433f20653e7 ("mm/hugetlb_vmemmap: fix race with speculative PFN walkers")
> > > > > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> > > > >
> > > > > [still regression on linux-next/master 0b58e108042b0ed28a71cd7edf5175999955b233]
> > > > >
> > > >
> > > > This has hit s390 huge page backed KVM guests as well.
> > > > Our simple start/stop test case went from ~5 to over 50 seconds of runtime.
> > >
> > > Could you try the attached patch please? Thank you.
> >
> > Thanks, Yosry, for spotting the following typo:
> >   flags &= VMEMMAP_SYNCHRONIZE_RCU;
> > It's supposed to be:
> >   flags &= ~VMEMMAP_SYNCHRONIZE_RCU;
> >
> > Reattaching v2 with the above typo fixed. Please let me know, Janosch & Oliver.
>
> since the commit is in mainline now, I directly apply your v2 patch upon
> bd225530a4c71 ("mm/hugetlb_vmemmap: fix race with speculative PFN walkers")
>
> in our tests, your v2 patch not only recovers the performance regression,

Thanks for verifying the fix!

> it even has +13.7% performance improvement than 5a4d8944d6b1e (parent of
> bd225530a4c71)

Glad to hear!

(The original patch improved and regressed the performance at the same
time, but the regression is bigger. The fix removed the regression and
surfaced the improvement.)

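For anyone following the typo mentioned in the quoted text, here is a minimal userspace sketch of why the missing `~` matters. It is not the kernel code: the bit positions below are assumed for illustration, and EXAMPLE_OTHER_FLAG, mask_buggy() and mask_fixed() are hypothetical names that do not appear in the patch.

```c
/* Illustrative bit values only; the real definitions live in the kernel. */
#define EXAMPLE_OTHER_FLAG       (1UL << 0)  /* hypothetical second flag */
#define VMEMMAP_SYNCHRONIZE_RCU  (1UL << 2)  /* assumed bit position */

/* Typo form: keeps only the RCU bit and drops every other flag. */
static unsigned long mask_buggy(unsigned long flags)
{
    flags &= VMEMMAP_SYNCHRONIZE_RCU;
    return flags;
}

/* Intended form: clears only the RCU bit and preserves the rest. */
static unsigned long mask_fixed(unsigned long flags)
{
    flags &= ~VMEMMAP_SYNCHRONIZE_RCU;
    return flags;
}
```

With both flags set on input, mask_buggy() returns just the RCU bit while mask_fixed() returns just the other flag, so the two forms leave opposite bits set.
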
> detail is as below
>
> =========================================================================================
> compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase:
>   gcc-13/performance/x86_64-rhel-8.3/debian-12-x86_64-20240206.cgz/300s/512G/lkp-icl-2sp2/anon-cow-rand-hugetlb/vm-scalability
>
> commit:
>   5a4d8944d6b1e ("cachestat: do not flush stats in recency check")
>   bd225530a4c71 ("mm/hugetlb_vmemmap: fix race with speculative PFN walkers")
>   9a5b87b521401 <---- your v2 patch
>
> 5a4d8944d6b1e1aa bd225530a4c717714722c373144 9a5b87b5214018a2be217dc4648
> ---------------- --------------------------- ---------------------------
> %stddev %change %stddev %change %stddev
> \ | \ | \
> 4.271e+09 ± 10% +348.4% 1.915e+10 ± 6% -39.9% 2.567e+09 ± 20% cpuidle..time
> 774593 ± 4% +1060.9% 8992186 ± 6% -17.2% 641254 cpuidle..usage
> 555365 ± 8% +28.0% 710795 ± 2% -4.5% 530157 ± 5% numa-numastat.node0.local_node
> 629633 ± 4% +23.0% 774346 ± 5% +0.6% 633264 ± 4% numa-numastat.node0.numa_hit
> 255.76 ± 2% +31.1% 335.40 ± 3% -13.8% 220.53 ± 2% uptime.boot
> 10305 ± 6% +144.3% 25171 ± 5% -17.1% 8543 ± 8% uptime.idle
> 1.83 ± 58% +96200.0% 1765 ±155% +736.4% 15.33 ± 24% perf-c2c.DRAM.local
> 33.00 ± 16% +39068.2% 12925 ±122% +95.5% 64.50 ± 49% perf-c2c.DRAM.remote
> 21.33 ± 8% +2361.7% 525.17 ± 31% +271.1% 79.17 ± 52% perf-c2c.HITM.local
> 9.17 ± 21% +3438.2% 324.33 ± 57% +270.9% 34.00 ± 60% perf-c2c.HITM.remote
> 16.11 ± 7% +37.1 53.16 ± 2% -4.6 11.50 ± 19% mpstat.cpu.all.idle%
> 0.34 ± 2% -0.1 0.22 +0.0 0.35 ± 3% mpstat.cpu.all.irq%
> 0.03 ± 5% +0.0 0.04 ± 8% -0.0 0.02 mpstat.cpu.all.soft%
> 10.58 ± 4% -9.5 1.03 ± 36% +0.1 10.71 ± 2% mpstat.cpu.all.sys%
> 72.94 ± 2% -27.4 45.55 ± 3% +4.5 77.41 ± 2% mpstat.cpu.all.usr%
> 6.00 ± 16% +230.6% 19.83 ± 5% +8.3% 6.50 ± 17% mpstat.max_utilization.seconds
> 16.95 ± 7% +215.5% 53.48 ± 2% -26.2% 12.51 ± 16% vmstat.cpu.id
> 72.33 ± 2% -37.4% 45.31 ± 3% +6.0% 76.65 ± 2% vmstat.cpu.us
> 2.254e+08 -0.0% 2.254e+08 +14.7% 2.584e+08 vmstat.memory.free
> 108.30 -43.3% 61.43 ± 2% +5.4% 114.12 ± 2% vmstat.procs.r
> 2659 +162.6% 6982 ± 3% +3.6% 2753 ± 4% vmstat.system.cs
> 136384 ± 4% -21.9% 106579 ± 7% +13.3% 154581 ± 3% vmstat.system.in
> 203.41 ± 2% +39.2% 283.06 ± 4% -17.1% 168.71 ± 2% time.elapsed_time
> 203.41 ± 2% +39.2% 283.06 ± 4% -17.1% 168.71 ± 2% time.elapsed_time.max
> 148901 ± 6% -45.6% 81059 ± 4% -8.8% 135748 ± 8% time.involuntary_context_switches
> 169.83 ± 23% +85.3% 314.67 ± 8% +7.9% 183.33 ± 7% time.major_page_faults
> 10697 -43.4% 6050 ± 2% +5.6% 11294 ± 2% time.percent_of_cpu_this_job_got
> 2740 ± 6% -86.7% 365.06 ± 43% -16.1% 2298 time.system_time
> 19012 -11.9% 16746 -11.9% 16747 time.user_time
> 14412 ± 5% +4432.0% 653187 -16.6% 12025 ± 3% time.voluntary_context_switches
> 50095 ± 2% -31.5% 34325 ± 2% +18.6% 59408 vm-scalability.median
> 8.25 ± 16% -3.4 4.84 ± 22% -6.6 1.65 ± 15% vm-scalability.median_stddev%
> 6863720 -34.0% 4532485 +13.7% 7805408 vm-scalability.throughput
> 203.41 ± 2% +39.2% 283.06 ± 4% -17.1% 168.71 ± 2% vm-scalability.time.elapsed_time
> 203.41 ± 2% +39.2% 283.06 ± 4% -17.1% 168.71 ± 2% vm-scalability.time.elapsed_time.max
> 148901 ± 6% -45.6% 81059 ± 4% -8.8% 135748 ± 8% vm-scalability.time.involuntary_context_switches
> 10697 -43.4% 6050 ± 2% +5.6% 11294 ± 2% vm-scalability.time.percent_of_cpu_this_job_got
> 2740 ± 6% -86.7% 365.06 ± 43% -16.1% 2298 vm-scalability.time.system_time
> 19012 -11.9% 16746 -11.9% 16747 vm-scalability.time.user_time
> 14412 ± 5% +4432.0% 653187 -16.6% 12025 ± 3% vm-scalability.time.voluntary_context_switches
> 1.159e+09 +0.0% 1.159e+09 +1.6% 1.178e+09 vm-scalability.workload
> 22900043 ± 4% +1.2% 23166356 ± 6% -16.7% 19076170 ± 5% numa-vmstat.node0.nr_free_pages
> 42856 ± 43% +998.5% 470779 ± 51% +318.6% 179409 ±154% numa-vmstat.node0.nr_unevictable
> 42856 ± 43% +998.5% 470779 ± 51% +318.6% 179409 ±154% numa-vmstat.node0.nr_zone_unevictable
> 629160 ± 4% +22.9% 773391 ± 5% +0.5% 632570 ± 4% numa-vmstat.node0.numa_hit
> 554892 ± 8% +27.9% 709841 ± 2% -4.6% 529463 ± 5% numa-vmstat.node0.numa_local
> 27469 ± 14% +0.0% 27475 ± 41% -31.7% 18763 ± 13% numa-vmstat.node1.nr_active_anon
> 767179 ± 2% -55.8% 339212 ± 72% -19.7% 616417 ± 43% numa-vmstat.node1.nr_file_pages
> 10693349 ± 5% +46.3% 15639681 ± 7% +69.4% 18112002 ± 3% numa-vmstat.node1.nr_free_pages
> 14210 ± 27% -65.0% 4973 ± 49% -34.7% 9280 ± 39% numa-vmstat.node1.nr_mapped
> 724050 ± 2% -59.1% 296265 ± 82% -18.9% 587498 ± 47% numa-vmstat.node1.nr_unevictable
> 27469 ± 14% +0.0% 27475 ± 41% -31.7% 18763 ± 13% numa-vmstat.node1.nr_zone_active_anon
> 724050 ± 2% -59.1% 296265 ± 82% -18.9% 587498 ± 47% numa-vmstat.node1.nr_zone_unevictable
> 120619 ± 11% +13.6% 137042 ± 27% -31.2% 82976 ± 7% meminfo.Active
> 120472 ± 11% +13.6% 136895 ± 27% -31.2% 82826 ± 7% meminfo.Active(anon)
> 70234807 +14.6% 80512468 +10.2% 77431344 meminfo.CommitLimit
> 2.235e+08 +0.1% 2.237e+08 +15.1% 2.573e+08 meminfo.DirectMap1G
> 44064 -22.8% 34027 ± 2% +20.7% 53164 ± 2% meminfo.HugePages_Surp
> 44064 -22.8% 34027 ± 2% +20.7% 53164 ± 2% meminfo.HugePages_Total
> 90243440 -22.8% 69688103 ± 2% +20.7% 1.089e+08 ± 2% meminfo.Hugetlb
> 70163 ± 29% -42.6% 40293 ± 11% -21.9% 54789 ± 15% meminfo.Mapped
> 1.334e+08 +15.5% 1.541e+08 +10.7% 1.477e+08 meminfo.MemAvailable
> 1.344e+08 +15.4% 1.551e+08 +10.7% 1.488e+08 meminfo.MemFree
> 2.307e+08 +0.0% 2.307e+08 +14.3% 2.637e+08 meminfo.MemTotal
> 96309843 -21.5% 75639108 ± 2% +19.4% 1.15e+08 ± 2% meminfo.Memused
> 259553 ± 2% -0.9% 257226 ± 15% -10.5% 232211 ± 4% meminfo.Shmem
> 1.2e+08 -2.4% 1.172e+08 +13.3% 1.36e+08 meminfo.max_used_kB
> 18884 ± 10% -7.2% 17519 ± 15% +37.6% 25983 ± 6% numa-meminfo.node0.HugePages_Surp
> 18884 ± 10% -7.2% 17519 ± 15% +37.6% 25983 ± 6% numa-meminfo.node0.HugePages_Total
> 91526744 ± 4% +1.2% 92620825 ± 6% -16.7% 76248423 ± 5% numa-meminfo.node0.MemFree
> 40158207 ± 9% -2.7% 39064126 ± 15% +38.0% 55436528 ± 7% numa-meminfo.node0.MemUsed
> 171426 ± 43% +998.5% 1883116 ± 51% +318.6% 717638 ±154% numa-meminfo.node0.Unevictable
> 110091 ± 14% -0.1% 109981 ± 41% -31.7% 75226 ± 13% numa-meminfo.node1.Active
> 110025 ± 14% -0.1% 109915 ± 41% -31.7% 75176 ± 13% numa-meminfo.node1.Active(anon)
> 3068496 ± 2% -55.8% 1356754 ± 72% -19.6% 2466084 ± 43% numa-meminfo.node1.FilePages
> 25218 ± 4% -34.7% 16475 ± 12% +7.9% 27213 ± 3% numa-meminfo.node1.HugePages_Surp
> 25218 ± 4% -34.7% 16475 ± 12% +7.9% 27213 ± 3% numa-meminfo.node1.HugePages_Total
> 55867 ± 27% -65.5% 19266 ± 50% -34.4% 36671 ± 38% numa-meminfo.node1.Mapped
> 42795888 ± 5% +46.1% 62520130 ± 7% +69.3% 72441496 ± 3% numa-meminfo.node1.MemFree
> 99028084 +0.0% 99028084 +33.4% 1.321e+08 numa-meminfo.node1.MemTotal
> 56232195 ± 3% -35.1% 36507953 ± 12% +6.0% 59616707 ± 4% numa-meminfo.node1.MemUsed
> 2896199 ± 2% -59.1% 1185064 ± 82% -18.9% 2349991 ± 47% numa-meminfo.node1.Unevictable
> 507357 +0.0% 507357 +1.7% 516000 proc-vmstat.htlb_buddy_alloc_success
> 29942 ± 10% +14.3% 34235 ± 27% -30.7% 20740 ± 7% proc-vmstat.nr_active_anon
> 3324095 +15.7% 3847387 +10.9% 3686860 proc-vmstat.nr_dirty_background_threshold
> 6656318 +15.7% 7704181 +10.9% 7382735 proc-vmstat.nr_dirty_threshold
> 33559092 +15.6% 38798108 +10.9% 37209133 proc-vmstat.nr_free_pages
> 197697 ± 2% -2.5% 192661 +1.0% 199623 proc-vmstat.nr_inactive_anon
> 17939 ± 28% -42.5% 10307 ± 11% -22.4% 13927 ± 14% proc-vmstat.nr_mapped
> 2691 -7.1% 2501 +2.9% 2769 proc-vmstat.nr_page_table_pages
> 64848 ± 2% -0.7% 64386 ± 15% -10.6% 57987 ± 4% proc-vmstat.nr_shmem
> 29942 ± 10% +14.3% 34235 ± 27% -30.7% 20740 ± 7% proc-vmstat.nr_zone_active_anon
> 197697 ± 2% -2.5% 192661 +1.0% 199623 proc-vmstat.nr_zone_inactive_anon
> 1403095 +9.3% 1534152 ± 2% -3.2% 1358244 proc-vmstat.numa_hit
> 1267544 +10.6% 1401482 ± 2% -3.4% 1224210 proc-vmstat.numa_local
> 2.608e+08 +0.1% 2.609e+08 +1.7% 2.651e+08 proc-vmstat.pgalloc_normal
> 1259957 +13.4% 1428284 ± 2% -6.5% 1178198 proc-vmstat.pgfault
> 2.591e+08 +0.3% 2.6e+08 +2.3% 2.649e+08 proc-vmstat.pgfree
> 36883 ± 3% +18.5% 43709 ± 5% -12.2% 32371 ± 3% proc-vmstat.pgreuse
> 1.88 ± 16% -0.6 1.33 ±100% +0.9 2.80 ± 11% perf-profile.calltrace.cycles-pp.nrand48_r
> 16.19 ± 85% +28.6 44.75 ± 95% -11.4 4.78 ±218% perf-profile.calltrace.cycles-pp.hugetlb_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
> 16.20 ± 85% +28.6 44.78 ± 95% -11.4 4.78 ±218% perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access
> 16.22 ± 85% +28.6 44.82 ± 95% -11.4 4.79 ±218% perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.do_access
> 16.22 ± 85% +28.6 44.82 ± 95% -11.4 4.79 ±218% perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access
> 16.24 ± 85% +28.8 45.01 ± 95% -11.4 4.80 ±218% perf-profile.calltrace.cycles-pp.asm_exc_page_fault.do_access
> 12.42 ± 84% +29.5 41.89 ± 95% -8.8 3.65 ±223% perf-profile.calltrace.cycles-pp.copy_mc_enhanced_fast_string.copy_subpage.copy_user_large_folio.hugetlb_wp.hugetlb_fault
> 12.52 ± 84% +29.6 42.08 ± 95% -8.8 3.68 ±223% perf-profile.calltrace.cycles-pp.copy_subpage.copy_user_large_folio.hugetlb_wp.hugetlb_fault.handle_mm_fault
> 12.53 ± 84% +29.7 42.23 ± 95% -8.9 3.68 ±223% perf-profile.calltrace.cycles-pp.copy_user_large_folio.hugetlb_wp.hugetlb_fault.handle_mm_fault.do_user_addr_fault
> 12.80 ± 84% +30.9 43.65 ± 95% -9.0 3.76 ±223% perf-profile.calltrace.cycles-pp.hugetlb_wp.hugetlb_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
> 2.50 ± 17% -0.7 1.78 ±100% +1.2 3.73 ± 11% perf-profile.children.cycles-pp.nrand48_r
> 16.24 ± 85% +28.6 44.87 ± 95% -11.4 4.79 ±218% perf-profile.children.cycles-pp.do_user_addr_fault
> 16.24 ± 85% +28.6 44.87 ± 95% -11.4 4.79 ±218% perf-profile.children.cycles-pp.exc_page_fault
> 16.20 ± 85% +28.7 44.86 ± 95% -11.4 4.78 ±218% perf-profile.children.cycles-pp.hugetlb_fault
> 16.22 ± 85% +28.7 44.94 ± 95% -11.4 4.79 ±218% perf-profile.children.cycles-pp.handle_mm_fault
> 16.26 ± 85% +28.8 45.06 ± 95% -11.5 4.80 ±218% perf-profile.children.cycles-pp.asm_exc_page_fault
> 12.51 ± 84% +29.5 42.01 ± 95% -8.8 3.75 ±218% perf-profile.children.cycles-pp.copy_mc_enhanced_fast_string
> 12.52 ± 84% +29.6 42.11 ± 95% -8.8 3.75 ±218% perf-profile.children.cycles-pp.copy_subpage
> 12.53 ± 84% +29.7 42.25 ± 95% -8.8 3.76 ±218% perf-profile.children.cycles-pp.copy_user_large_folio
> 12.80 ± 84% +30.9 43.65 ± 95% -9.0 3.83 ±218% perf-profile.children.cycles-pp.hugetlb_wp
> 2.25 ± 17% -0.7 1.59 ±100% +1.1 3.36 ± 11% perf-profile.self.cycles-pp.nrand48_r
> 1.74 ± 21% -0.5 1.25 ± 92% +1.2 2.94 ± 13% perf-profile.self.cycles-pp.do_access
> 0.27 ± 17% -0.1 0.19 ±100% +0.1 0.40 ± 11% perf-profile.self.cycles-pp.lrand48_r
> 12.41 ± 84% +29.4 41.80 ± 95% -8.7 3.72 ±218% perf-profile.self.cycles-pp.copy_mc_enhanced_fast_string
> 350208 ± 16% -2.7% 340891 ± 36% -47.2% 184918 ± 9% sched_debug.cfs_rq:/.avg_vruntime.stddev
> 16833 ±149% -100.0% 3.19 ±100% -100.0% 0.58 ±179% sched_debug.cfs_rq:/.left_deadline.avg
> 2154658 ±149% -100.0% 317.15 ± 93% -100.0% 74.40 ±179% sched_debug.cfs_rq:/.left_deadline.max
> 189702 ±149% -100.0% 29.47 ± 94% -100.0% 6.55 ±179% sched_debug.cfs_rq:/.left_deadline.stddev
> 16833 ±149% -100.0% 3.05 ±102% -100.0% 0.58 ±179% sched_debug.cfs_rq:/.left_vruntime.avg
> 2154613 ±149% -100.0% 298.70 ± 95% -100.0% 74.06 ±179% sched_debug.cfs_rq:/.left_vruntime.max
> 189698 ±149% -100.0% 27.96 ± 96% -100.0% 6.52 ±179% sched_debug.cfs_rq:/.left_vruntime.stddev
> 350208 ± 16% -2.7% 340891 ± 36% -47.2% 184918 ± 9% sched_debug.cfs_rq:/.min_vruntime.stddev
> 52.88 ± 14% -19.5% 42.56 ± 39% +22.8% 64.94 ± 9% sched_debug.cfs_rq:/.removed.load_avg.stddev
> 16833 ±149% -100.0% 3.05 ±102% -100.0% 0.58 ±179% sched_debug.cfs_rq:/.right_vruntime.avg
> 2154613 ±149% -100.0% 298.70 ± 95% -100.0% 74.11 ±179% sched_debug.cfs_rq:/.right_vruntime.max
> 189698 ±149% -100.0% 27.96 ± 96% -100.0% 6.53 ±179% sched_debug.cfs_rq:/.right_vruntime.stddev
> 1588 ± 9% -31.2% 1093 ± 18% -20.0% 1270 ± 16% sched_debug.cfs_rq:/.runnable_avg.max
> 676.36 ± 7% -94.8% 35.08 ± 42% -2.7% 657.82 ± 3% sched_debug.cfs_rq:/.util_est.avg
> 1339 ± 8% -74.5% 341.42 ± 24% -22.6% 1037 ± 23% sched_debug.cfs_rq:/.util_est.max
> 152.67 ± 35% -72.3% 42.35 ± 21% -14.9% 129.89 ± 33% sched_debug.cfs_rq:/.util_est.stddev
> 1116839 ± 7% -7.1% 1037321 ± 4% +22.9% 1372316 ± 11% sched_debug.cpu.avg_idle.max
> 126915 ± 10% +31.6% 166966 ± 6% -12.2% 111446 ± 2% sched_debug.cpu.clock.avg
> 126930 ± 10% +31.6% 166977 ± 6% -12.2% 111459 ± 2% sched_debug.cpu.clock.max
> 126899 ± 10% +31.6% 166949 ± 6% -12.2% 111428 ± 2% sched_debug.cpu.clock.min
> 126491 ± 10% +31.7% 166537 ± 6% -12.2% 111078 ± 2% sched_debug.cpu.clock_task.avg
> 126683 ± 10% +31.6% 166730 ± 6% -12.2% 111237 ± 2% sched_debug.cpu.clock_task.max
> 117365 ± 11% +33.6% 156775 ± 6% -13.0% 102099 ± 2% sched_debug.cpu.clock_task.min
> 2826 ± 10% +178.1% 7858 ± 8% -10.3% 2534 ± 6% sched_debug.cpu.nr_switches.avg
> 755.38 ± 15% +423.8% 3956 ± 14% -15.2% 640.33 ± 3% sched_debug.cpu.nr_switches.min
> 126900 ± 10% +31.6% 166954 ± 6% -12.2% 111432 ± 2% sched_debug.cpu_clk
> 125667 ± 10% +31.9% 165721 ± 6% -12.3% 110200 ± 2% sched_debug.ktime
> 0.54 ±141% -99.9% 0.00 ±132% -99.9% 0.00 ±114% sched_debug.rt_rq:.rt_time.avg
> 69.73 ±141% -99.9% 0.06 ±132% -99.9% 0.07 ±114% sched_debug.rt_rq:.rt_time.max
> 6.14 ±141% -99.9% 0.01 ±132% -99.9% 0.01 ±114% sched_debug.rt_rq:.rt_time.stddev
> 127860 ± 10% +31.3% 167917 ± 6% -12.1% 112402 ± 2% sched_debug.sched_clk
> 15.99 +363.6% 74.14 ± 6% +10.1% 17.61 perf-stat.i.MPKI
> 1.467e+10 ± 2% -32.0% 9.975e+09 ± 3% +21.3% 1.779e+10 ± 2% perf-stat.i.branch-instructions
> 0.10 ± 5% +0.6 0.68 ± 5% +0.0 0.11 ± 4% perf-stat.i.branch-miss-rate%
> 10870114 ± 3% -26.4% 8001551 ± 3% +15.7% 12580898 ± 2% perf-stat.i.branch-misses
> 97.11 -20.0 77.11 -0.0 97.10 perf-stat.i.cache-miss-rate%
> 8.118e+08 ± 2% -32.5% 5.482e+08 ± 3% +23.1% 9.992e+08 ± 2% perf-stat.i.cache-misses
> 8.328e+08 ± 2% -28.4% 5.963e+08 ± 3% +22.8% 1.023e+09 ± 2% perf-stat.i.cache-references
> 2601 ± 2% +172.3% 7083 ± 3% +2.5% 2665 ± 5% perf-stat.i.context-switches
> 5.10 +39.5% 7.11 ± 9% -9.2% 4.62 perf-stat.i.cpi
> 2.826e+11 -44.1% 1.58e+11 ± 2% +5.7% 2.987e+11 ± 2% perf-stat.i.cpu-cycles
> 216.56 +42.4% 308.33 ± 6% +2.2% 221.23 perf-stat.i.cpu-migrations
> 358.79 -0.3% 357.70 ± 21% -14.1% 308.23 perf-stat.i.cycles-between-cache-misses
> 6.286e+10 ± 2% -31.7% 4.293e+10 ± 3% +21.3% 7.626e+10 ± 2% perf-stat.i.instructions
> 0.24 +39.9% 0.33 ± 4% +13.6% 0.27 perf-stat.i.ipc
> 5844 -16.9% 4856 ± 2% +12.5% 6577 perf-stat.i.minor-faults
> 5846 -16.9% 4857 ± 2% +12.5% 6578 perf-stat.i.page-faults
> 13.00 -2.2% 12.72 +1.2% 13.15 perf-stat.overall.MPKI
> 0.07 +0.0 0.08 -0.0 0.07 perf-stat.overall.branch-miss-rate%
> 97.44 -5.3 92.09 +0.2 97.66 perf-stat.overall.cache-miss-rate%
> 4.51 -18.4% 3.68 -13.0% 3.92 perf-stat.overall.cpi
> 346.76 -16.6% 289.11 -14.0% 298.06 perf-stat.overall.cycles-between-cache-misses
> 0.22 +22.6% 0.27 +15.0% 0.26 perf-stat.overall.ipc
> 10906 -3.4% 10541 -1.1% 10784 perf-stat.overall.path-length
> 1.445e+10 ± 2% -30.7% 1.001e+10 ± 3% +21.2% 1.752e+10 ± 2% perf-stat.ps.branch-instructions
> 10469697 ± 3% -23.5% 8005730 ± 3% +18.3% 12387061 ± 2% perf-stat.ps.branch-misses
> 8.045e+08 ± 2% -31.9% 5.478e+08 ± 3% +22.7% 9.874e+08 ± 2% perf-stat.ps.cache-misses
> 8.257e+08 ± 2% -27.9% 5.95e+08 ± 3% +22.5% 1.011e+09 ± 2% perf-stat.ps.cache-references
> 2584 ± 2% +169.3% 6958 ± 3% +2.7% 2654 ± 4% perf-stat.ps.context-switches
> 2.789e+11 -43.2% 1.583e+11 ± 2% +5.5% 2.943e+11 ± 2% perf-stat.ps.cpu-cycles
> 214.69 +41.8% 304.37 ± 6% +2.2% 219.46 perf-stat.ps.cpu-migrations
> 6.19e+10 ± 2% -30.4% 4.309e+10 ± 3% +21.3% 7.507e+10 ± 2% perf-stat.ps.instructions
> 5849 -18.0% 4799 ± 2% +12.3% 6568 ± 2% perf-stat.ps.minor-faults
> 5851 -18.0% 4800 ± 2% +12.3% 6570 ± 2% perf-stat.ps.page-faults
> 1.264e+13 -3.4% 1.222e+13 +0.5% 1.27e+13 perf-stat.total.instructions
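
A note on reading the quoted table: both %change columns appear to be relative to the first (base) column, 5a4d8944d6b1e, rather than to the column immediately to their left. A minimal sketch that checks this against the vm-scalability.throughput row follows; pct_change() is a hypothetical helper written for this note, not part of the lkp tooling.

```c
#include <assert.h>

/*
 * Percentage change of `value` relative to `base`.
 * Hypothetical helper for illustration only.
 */
static double pct_change(double base, double value)
{
    assert(base != 0.0);  /* a zero base has no defined relative change */
    return (value - base) / base * 100.0;
}
```

Feeding in the throughput row, pct_change(6863720, 4532485) comes out at roughly -34.0 and pct_change(6863720, 7805408) at roughly +13.7, which matches the two reported figures when both are measured against the parent commit.
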