On Mon, Jun 11, 2018 at 05:15:22PM +0200, Laurent Dufour wrote:

Hi Laurent,

For the perf data tested on the Intel 4s Skylake platform, please find the comparison
result between the base and head commits in the attachment, which includes the
perf-profile comparison information. Also attached are some perf-profile.json files
captured from the test results for page_fault2 and page_fault3, for checking the
regression. Thanks.

Best regards,
Haiyan Song

> Hi Haiyan,
>
> I don't have access to the same hardware you ran the test on, but I gave those
> tests a try on a Power8 system (2 sockets, 5 cores/socket, 8 threads/core,
> 80 CPUs, 32G).
> I ran each will-it-scale test 10 times and computed the average.
>
> test THP enabled              4.17.0-rc4-mm1  spf         delta
> page_fault3_threads           2697.7          2683.5      -0.53%
> page_fault2_threads           170660.6        169574.1    -0.64%
> context_switch1_threads       6915269.2       6877507.3   -0.55%
> context_switch1_processes     6478076.2       6529493.5    0.79%
> brk1                          243391.2        238527.5    -2.00%
>
> Tests were launched with the arguments '-t 80 -s 5'; only the average report is
> taken into account. Note that the page size is 64K by default on ppc64.
>
> It would be nice if you could capture some perf data to figure out why the
> page_fault2/3 tests are showing such a performance regression.
>
> Thanks,
> Laurent.
>
> On 11/06/2018 09:49, Song, HaiyanX wrote:
> > Hi Laurent,
> >
> > Regression tests for the v11 patch series have been run; some regressions were
> > found by LKP-tools (Linux kernel performance), tested on the Intel 4s Skylake
> > platform. This time only the cases which had been run and shown regressions on
> > the V9 patch series were tested.
> >
> > The regression results are sorted by the metric will-it-scale.per_thread_ops.
> > branch: Laurent-Dufour/Speculative-page-faults/20180520-045126
> > commit id:
> >   head commit : a7a8993bfe3ccb54ad468b9f1799649e4ad1ff12
> >   base commit : ba98a1cdad71d259a194461b3a61471b49b14df1
> > Benchmark: will-it-scale
> > Download link: https://github.com/antonblanchard/will-it-scale/tree/master
> >
> > Metrics:
> >   will-it-scale.per_process_ops=processes/nr_cpu
> >   will-it-scale.per_thread_ops=threads/nr_cpu
> > test box: lkp-skl-4sp1(nr_cpu=192,memory=768G)
> > THP: enable / disable
> > nr_task: 100%
> >
> > 1. Regressions:
> >
> > a) Enable THP
> > testcase                        base      change    head      metric
> > page_fault3/enable THP          10519     -20.5%    8368      will-it-scale.per_thread_ops
> > page_fault2/enable THP          8281      -18.8%    6728      will-it-scale.per_thread_ops
> > brk1/enable THP                 998475    -2.2%     976893    will-it-scale.per_process_ops
> > context_switch1/enable THP      223910    -1.3%     220930    will-it-scale.per_process_ops
> > context_switch1/enable THP      233722    -1.0%     231288    will-it-scale.per_thread_ops
> >
> > b) Disable THP
> > page_fault3/disable THP         10856     -23.1%    8344      will-it-scale.per_thread_ops
> > page_fault2/disable THP         8147      -18.8%    6613      will-it-scale.per_thread_ops
> > brk1/disable THP                957       -7.9%     881       will-it-scale.per_thread_ops
> > context_switch1/disable THP     237006    -2.2%     231907    will-it-scale.per_thread_ops
> > brk1/disable THP                997317    -2.0%     977778    will-it-scale.per_process_ops
> > page_fault3/disable THP         467454    -1.8%     459251    will-it-scale.per_process_ops
> > context_switch1/disable THP     224431    -1.3%     221567    will-it-scale.per_process_ops
> >
> > Notes: for the above test result values, higher is better.
> >
> > 2. Improvement: no improvement was found based on the selected test cases.
> > > > > > Best regards > > Haiyan Song > > ________________________________________ > > From: owner-linux-mm@xxxxxxxxx [owner-linux-mm@xxxxxxxxx] on behalf of Laurent Dufour [ldufour@xxxxxxxxxxxxxxxxxx] > > Sent: Monday, May 28, 2018 4:54 PM > > To: Song, HaiyanX > > Cc: akpm@xxxxxxxxxxxxxxxxxxxx; mhocko@xxxxxxxxxx; peterz@xxxxxxxxxxxxx; kirill@xxxxxxxxxxxxx; ak@xxxxxxxxxxxxxxx; dave@xxxxxxxxxxxx; jack@xxxxxxx; Matthew Wilcox; khandual@xxxxxxxxxxxxxxxxxx; aneesh.kumar@xxxxxxxxxxxxxxxxxx; benh@xxxxxxxxxxxxxxxxxxx; mpe@xxxxxxxxxxxxxx; paulus@xxxxxxxxx; Thomas Gleixner; Ingo Molnar; hpa@xxxxxxxxx; Will Deacon; Sergey Senozhatsky; sergey.senozhatsky.work@xxxxxxxxx; Andrea Arcangeli; Alexei Starovoitov; Wang, Kemi; Daniel Jordan; David Rientjes; Jerome Glisse; Ganesh Mahendran; Minchan Kim; Punit Agrawal; vinayak menon; Yang Shi; linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; haren@xxxxxxxxxxxxxxxxxx; npiggin@xxxxxxxxx; bsingharora@xxxxxxxxx; paulmck@xxxxxxxxxxxxxxxxxx; Tim Chen; linuxppc-dev@xxxxxxxxxxxxxxxx; x86@xxxxxxxxxx > > Subject: Re: [PATCH v11 00/26] Speculative page faults > > > > On 28/05/2018 10:22, Haiyan Song wrote: > >> Hi Laurent, > >> > >> Yes, these tests are done on V9 patch. > > > > Do you plan to give this V11 a run ? > > > >> > >> > >> Best regards, > >> Haiyan Song > >> > >> On Mon, May 28, 2018 at 09:51:34AM +0200, Laurent Dufour wrote: > >>> On 28/05/2018 07:23, Song, HaiyanX wrote: > >>>> > >>>> Some regression and improvements is found by LKP-tools(linux kernel performance) on V9 patch series > >>>> tested on Intel 4s Skylake platform. > >>> > >>> Hi, > >>> > >>> Thanks for reporting this benchmark results, but you mentioned the "V9 patch > >>> series" while responding to the v11 header series... > >>> Were these tests done on v9 or v11 ? > >>> > >>> Cheers, > >>> Laurent. > >>> > >>>> > >>>> The regression result is sorted by the metric will-it-scale.per_thread_ops. > >>>> Branch: Laurent-Dufour/Speculative-page-faults/20180316-151833 (V9 patch series) > >>>> Commit id: > >>>> base commit: d55f34411b1b126429a823d06c3124c16283231f > >>>> head commit: 0355322b3577eeab7669066df42c550a56801110 > >>>> Benchmark suite: will-it-scale > >>>> Download link: > >>>> https://github.com/antonblanchard/will-it-scale/tree/master/tests > >>>> Metrics: > >>>> will-it-scale.per_process_ops=processes/nr_cpu > >>>> will-it-scale.per_thread_ops=threads/nr_cpu > >>>> test box: lkp-skl-4sp1(nr_cpu=192,memory=768G) > >>>> THP: enable / disable > >>>> nr_task: 100% > >>>> > >>>> 1. Regressions: > >>>> a) THP enabled: > >>>> testcase base change head metric > >>>> page_fault3/ enable THP 10092 -17.5% 8323 will-it-scale.per_thread_ops > >>>> page_fault2/ enable THP 8300 -17.2% 6869 will-it-scale.per_thread_ops > >>>> brk1/ enable THP 957.67 -7.6% 885 will-it-scale.per_thread_ops > >>>> page_fault3/ enable THP 172821 -5.3% 163692 will-it-scale.per_process_ops > >>>> signal1/ enable THP 9125 -3.2% 8834 will-it-scale.per_process_ops > >>>> > >>>> b) THP disabled: > >>>> testcase base change head metric > >>>> page_fault3/ disable THP 10107 -19.1% 8180 will-it-scale.per_thread_ops > >>>> page_fault2/ disable THP 8432 -17.8% 6931 will-it-scale.per_thread_ops > >>>> context_switch1/ disable THP 215389 -6.8% 200776 will-it-scale.per_thread_ops > >>>> brk1/ disable THP 939.67 -6.6% 877.33 will-it-scale.per_thread_ops > >>>> page_fault3/ disable THP 173145 -4.7% 165064 will-it-scale.per_process_ops > >>>> signal1/ disable THP 9162 -3.9% 8802 will-it-scale.per_process_ops > >>>> > >>>> 2. 
> >>>> Improvements:
> >>>> a) THP enabled:
> >>>> testcase                         base        change     head        metric
> >>>> malloc1/ enable THP              66.33       +469.8%    383.67      will-it-scale.per_thread_ops
> >>>> writeseek3/ enable THP           2531        +4.5%      2646        will-it-scale.per_thread_ops
> >>>> signal1/ enable THP              989.33      +2.8%      1016        will-it-scale.per_thread_ops
> >>>>
> >>>> b) THP disabled:
> >>>> testcase                         base        change     head        metric
> >>>> malloc1/ disable THP             90.33       +417.3%    467.33      will-it-scale.per_thread_ops
> >>>> read2/ disable THP               58934       +39.2%     82060       will-it-scale.per_thread_ops
> >>>> page_fault1/ disable THP         8607        +36.4%     11736       will-it-scale.per_thread_ops
> >>>> read1/ disable THP               314063      +12.7%     353934      will-it-scale.per_thread_ops
> >>>> writeseek3/ disable THP          2452        +12.5%     2759        will-it-scale.per_thread_ops
> >>>> signal1/ disable THP             971.33      +5.5%      1024        will-it-scale.per_thread_ops
> >>>>
> >>>> Notes: for the values above in the "change" column, a higher value means that the related
> >>>> testcase result on the head commit is better than that on the base commit for this benchmark.
> >>>>
> >>>>
> >>>> Best regards
> >>>> Haiyan Song
> >>>>
> >>>> ________________________________________
> >>>> From: owner-linux-mm@xxxxxxxxx [owner-linux-mm@xxxxxxxxx] on behalf of Laurent Dufour [ldufour@xxxxxxxxxxxxxxxxxx]
> >>>> Sent: Thursday, May 17, 2018 7:06 PM
> >>>> To: akpm@xxxxxxxxxxxxxxxxxxxx; mhocko@xxxxxxxxxx; peterz@xxxxxxxxxxxxx; kirill@xxxxxxxxxxxxx; ak@xxxxxxxxxxxxxxx; dave@xxxxxxxxxxxx; jack@xxxxxxx; Matthew Wilcox; khandual@xxxxxxxxxxxxxxxxxx; aneesh.kumar@xxxxxxxxxxxxxxxxxx; benh@xxxxxxxxxxxxxxxxxxx; mpe@xxxxxxxxxxxxxx; paulus@xxxxxxxxx; Thomas Gleixner; Ingo Molnar; hpa@xxxxxxxxx; Will Deacon; Sergey Senozhatsky; sergey.senozhatsky.work@xxxxxxxxx; Andrea Arcangeli; Alexei Starovoitov; Wang, Kemi; Daniel Jordan; David Rientjes; Jerome Glisse; Ganesh Mahendran; Minchan Kim; Punit Agrawal; vinayak menon; Yang Shi
> >>>> Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; haren@xxxxxxxxxxxxxxxxxx; npiggin@xxxxxxxxx; bsingharora@xxxxxxxxx; paulmck@xxxxxxxxxxxxxxxxxx; Tim Chen; linuxppc-dev@xxxxxxxxxxxxxxxx; x86@xxxxxxxxxx
> >>>> Subject: [PATCH v11 00/26] Speculative page faults
> >>>>
> >>>> This is a port to kernel 4.17 of the work done by Peter Zijlstra to handle
> >>>> page faults without holding the mm semaphore [1].
> >>>>
> >>>> The idea is to try to handle user space page faults without holding the
> >>>> mmap_sem. This should allow better concurrency for massively threaded
> >>>> processes, since the page fault handler will not wait for other threads'
> >>>> memory layout changes to be done, assuming that those changes are done in
> >>>> another part of the process's memory space. This type of page fault is
> >>>> named a speculative page fault. If the speculative page fault fails because
> >>>> concurrency is detected or because the underlying PMD or PTE tables are not
> >>>> yet allocated, its processing is aborted and a classic page fault is then
> >>>> tried.
> >>>>
> >>>> The speculative page fault (SPF) has to look for the VMA matching the fault
> >>>> address without holding the mmap_sem; this is done by introducing a rwlock
> >>>> which protects access to the mm_rb tree. Previously this was done using
> >>>> SRCU, but it introduced a lot of scheduling to process the VMAs' freeing
> >>>> operations, which was hurting performance by 20% as reported by Kemi Wang
> >>>> [2]. Using a rwlock to protect access to the mm_rb tree limits the locking
> >>>> contention to these operations, which are expected to be of O(log n) order.
> >>>> In addition, to ensure that the VMA is not freed behind our back, a
> >>>> reference count is added and 2 services (get_vma() and put_vma()) are
> >>>> introduced to handle the reference count. Once a VMA is fetched from the RB
> >>>> tree using get_vma(), it must later be released using put_vma(). With this,
> >>>> I can no longer see the overhead I previously got while running the
> >>>> will-it-scale benchmark.
> >>>>
> >>>> The VMA's attributes checked during the speculative page fault processing
> >>>> have to be protected against parallel changes. This is done by using a
> >>>> per-VMA sequence lock. This sequence lock allows the speculative page fault
> >>>> handler to quickly check for parallel changes in progress and to abort the
> >>>> speculative page fault in that case.
> >>>>
> >>>> Once the VMA has been found, the speculative page fault handler checks the
> >>>> VMA's attributes to verify whether the page fault can be handled correctly
> >>>> or not. Thus, the VMA is protected through a sequence lock which allows
> >>>> fast detection of concurrent VMA changes. If such a change is detected, the
> >>>> speculative page fault is aborted and a *classic* page fault is tried. VMA
> >>>> sequence locking is added wherever the VMA attributes which are checked
> >>>> during the page fault are modified.
> >>>>
> >>>> When the PTE is fetched, the VMA is checked to see if it has been changed,
> >>>> so once the page table is locked the VMA is valid; any other change leading
> >>>> to touching this PTE will need to lock the page table, so no parallel
> >>>> change is possible at this time.
> >>>>
> >>>> The locking of the PTE is done with interrupts disabled; this allows
> >>>> checking the PMD to ensure that there is no ongoing collapsing operation.
> >>>> Since khugepaged first sets the PMD to pmd_none and then waits for the
> >>>> other CPUs to have caught the IPI interrupt, if the PMD is valid at the
> >>>> time the PTE is locked, we have the guarantee that the collapsing operation
> >>>> will have to wait on the PTE lock to move forward. This allows the SPF
> >>>> handler to map the PTE safely. If the PMD value is different from the one
> >>>> recorded at the beginning of the SPF operation, the classic page fault
> >>>> handler will be called to handle the operation while holding the mmap_sem.
> >>>> As the PTE lock is taken with interrupts disabled, the lock is acquired
> >>>> using spin_trylock() to avoid a deadlock when handling a page fault while a
> >>>> TLB invalidation is requested by another CPU holding the PTE lock.
> >>>>
> >>>> In pseudo code, this could be seen as:
> >>>>     speculative_page_fault()
> >>>>     {
> >>>>             vma = get_vma()
> >>>>             check vma sequence count
> >>>>             check vma's support
> >>>>             disable interrupt
> >>>>                   check pgd,p4d,...,pte
> >>>>                   save pmd and pte in vmf
> >>>>                   save vma sequence counter in vmf
> >>>>             enable interrupt
> >>>>             check vma sequence count
> >>>>             handle_pte_fault(vma)
> >>>>                     ..
> >>>>                     page = alloc_page()
> >>>>                     pte_map_lock()
> >>>>                             disable interrupt
> >>>>                                     abort if sequence counter has changed
> >>>>                                     abort if pmd or pte has changed
> >>>>                                     pte map and lock
> >>>>                             enable interrupt
> >>>>                     if abort
> >>>>                            free page
> >>>>                            abort
> >>>>                     ...
> >>>>     }
> >>>>
> >>>>     arch_fault_handler()
> >>>>     {
> >>>>             if (speculative_page_fault(&vma))
> >>>>                    goto done
> >>>>     again:
> >>>>             lock(mmap_sem)
> >>>>             vma = find_vma();
> >>>>             handle_pte_fault(vma);
> >>>>             if retry
> >>>>                    unlock(mmap_sem)
> >>>>                    goto again;
> >>>>     done:
> >>>>             handle fault error
> >>>>     }
> >>>>
> >>>> Support for THP is not done because, when checking the PMD, we could be
> >>>> confused by an in-progress collapsing operation done by khugepaged. The
> >>>> issue is that pmd_none() could be true either if the PMD is not already
> >>>> populated or if the underlying PTEs are in the process of being collapsed.
> >>>> So we cannot safely allocate a PMD if pmd_none() is true.
> >>>>
> >>>> This series adds a new software performance event named 'speculative-faults'
> >>>> or 'spf'. It counts the number of successful page fault events handled
> >>>> speculatively. When recording 'faults,spf' events, the 'faults' one counts
> >>>> the total number of page fault events while 'spf' only counts the part of
> >>>> the faults processed speculatively.
> >>>>
> >>>> There are some trace events introduced by this series. They allow
> >>>> identifying why the page faults were not processed speculatively. This
> >>>> doesn't take into account the faults generated by a monothreaded process,
> >>>> which are directly processed while holding the mmap_sem. These trace events
> >>>> are grouped in a system named 'pagefault'; they are:
> >>>>   - pagefault:spf_vma_changed : the VMA has been changed behind our back
> >>>>   - pagefault:spf_vma_noanon  : the vma->anon_vma field was not yet set
> >>>>   - pagefault:spf_vma_notsup  : the VMA's type is not supported
> >>>>   - pagefault:spf_vma_access  : the VMA's access rights are not respected
> >>>>   - pagefault:spf_pmd_changed : the upper PMD pointer has changed behind
> >>>>     our back
> >>>>
> >>>> To record all the related events, the easiest way is to run perf with the
> >>>> following arguments:
> >>>> $ perf stat -e 'faults,spf,pagefault:*' <command>
> >>>>
> >>>> There is also a dedicated vmstat counter showing the number of successful
> >>>> page faults handled speculatively. It can be seen this way:
> >>>> $ grep speculative_pgfault /proc/vmstat
> >>>>
> >>>> This series builds on top of v4.16-mmotm-2018-04-13-17-28 and is functional
> >>>> on x86, PowerPC and arm64.
> >>>>
> >>>> ---------------------
> >>>> Real Workload results
> >>>>
> >>>> As mentioned in a previous email, we did unofficial runs using a "popular
> >>>> in-memory multithreaded database product" on a 176-core SMT8 Power system,
> >>>> which showed a 30% improvement in the number of transactions processed per
> >>>> second. This run was done on the v6 series, but changes introduced in this
> >>>> new version should not impact the performance boost seen.
> >>>>
> >>>> Here are the perf data captured during 2 of these runs on top of the v8
> >>>> series:
> >>>>                 vanilla         spf
> >>>> faults          89.418          101.364         +13%
> >>>> spf             n/a             97.989
> >>>>
> >>>> With the SPF kernel, most of the page faults were processed in a
> >>>> speculative way.
> >>>>
> >>>> Ganesh Mahendran backported the series on top of a 4.9 kernel and gave it a
> >>>> try on an Android device. He reported that the application launch time was
> >>>> improved on average by 6%, and for large applications (~100 threads) by
> >>>> 20%.
> >>>>
> >>>> Here are the launch times Ganesh measured on Android 8.0 on top of a Qcom
> >>>> MSM845 (8 cores) with 6GB of memory (lower is better):
> >>>>
> >>>> Application                          4.9     4.9+spf   delta
> >>>> com.tencent.mm                       416     389       -7%
> >>>> com.eg.android.AlipayGphone          1135    986       -13%
> >>>> com.tencent.mtt                      455     454       0%
> >>>> com.qqgame.hlddz                     1497    1409      -6%
> >>>> com.autonavi.minimap                 711     701       -1%
> >>>> com.tencent.tmgp.sgame               788     748       -5%
> >>>> com.immomo.momo                      501     487       -3%
> >>>> com.tencent.peng                     2145    2112      -2%
> >>>> com.smile.gifmaker                   491     461       -6%
> >>>> com.baidu.BaiduMap                   479     366       -23%
> >>>> com.taobao.taobao                    1341    1198      -11%
> >>>> com.baidu.searchbox                  333     314       -6%
> >>>> com.tencent.mobileqq                 394     384       -3%
> >>>> com.sina.weibo                       907     906       0%
> >>>> com.youku.phone                      816     731       -11%
> >>>> com.happyelements.AndroidAnimal.qq   763     717       -6%
> >>>> com.UCMobile                         415     411       -1%
> >>>> com.tencent.tmgp.ak                  1464    1431      -2%
> >>>> com.tencent.qqmusic                  336     329       -2%
> >>>> com.sankuai.meituan                  1661    1302      -22%
> >>>> com.netease.cloudmusic               1193    1200      1%
> >>>> air.tv.douyu.android                 4257    4152      -2%
> >>>>
> >>>> ------------------
> >>>> Benchmark results
> >>>>
> >>>> Base kernel is v4.17.0-rc4-mm1
> >>>> SPF is BASE + this series
> >>>>
> >>>> Kernbench:
> >>>> ----------
> >>>> Here are the results on a 16-CPU x86 guest using kernbench on a 4.15
> >>>> kernel (the kernel is built 5 times):
> >>>>
> >>>> Average Half load -j 8 Run (std deviation)
> >>>>                          BASE                   SPF
> >>>> Elapsed Time             1448.65 (5.72312)      1455.84 (4.84951)      0.50%
> >>>> User Time                10135.4 (30.3699)      10148.8 (31.1252)      0.13%
> >>>> System Time              900.47 (2.81131)       923.28 (7.52779)       2.53%
> >>>> Percent CPU              761.4 (1.14018)        760.2 (0.447214)       -0.16%
> >>>> Context Switches         85380 (3419.52)        84748 (1904.44)        -0.74%
> >>>> Sleeps                   105064 (1240.96)       105074 (337.612)       0.01%
> >>>>
> >>>> Average Optimal load -j 16 Run (std deviation)
> >>>>                          BASE                   SPF
> >>>> Elapsed Time             920.528 (10.1212)      927.404 (8.91789)      0.75%
> >>>> User Time                11064.8 (981.142)      11085 (990.897)        0.18%
> >>>> System Time              979.904 (84.0615)      1001.14 (82.5523)      2.17%
> >>>> Percent CPU              1089.5 (345.894)       1086.1 (343.545)       -0.31%
> >>>> Context Switches         159488 (78156.4)       158223 (77472.1)       -0.79%
> >>>> Sleeps                   110566 (5877.49)       110388 (5617.75)       -0.16%
> >>>>
> >>>> During a run on the SPF kernel, perf events were captured:
> >>>> Performance counter stats for '../kernbench -M':
> >>>>          526743764      faults
> >>>>                210      spf
> >>>>                  3      pagefault:spf_vma_changed
> >>>>                  0      pagefault:spf_vma_noanon
> >>>>               2278      pagefault:spf_vma_notsup
> >>>>                  0      pagefault:spf_vma_access
> >>>>                  0      pagefault:spf_pmd_changed
> >>>>
> >>>> Very few speculative page faults were recorded as most of the processes
> >>>> involved are monothreaded (it seems that on this architecture some threads
> >>>> were created during the kernel build processing).
> >>>>
> >>>> Here are the kernbench results on an 80-CPU Power8 system:
> >>>>
> >>>> Average Half load -j 40 Run (std deviation)
> >>>>                          BASE                   SPF
> >>>> Elapsed Time             117.152 (0.774642)     117.166 (0.476057)     0.01%
> >>>> User Time                4478.52 (24.7688)      4479.76 (9.08555)      0.03%
> >>>> System Time              131.104 (0.720056)     134.04 (0.708414)      2.24%
> >>>> Percent CPU              3934 (19.7104)         3937.2 (19.0184)       0.08%
> >>>> Context Switches         92125.4 (576.787)      92581.6 (198.622)      0.50%
> >>>> Sleeps                   317923 (652.499)       318469 (1255.59)       0.17%
> >>>>
> >>>> Average Optimal load -j 80 Run (std deviation)
> >>>>                          BASE                   SPF
> >>>> Elapsed Time             107.73 (0.632416)      107.31 (0.584936)      -0.39%
> >>>> User Time                5869.86 (1466.72)      5871.71 (1467.27)      0.03%
> >>>> System Time              153.728 (23.8573)      157.153 (24.3704)      2.23%
> >>>> Percent CPU              5418.6 (1565.17)       5436.7 (1580.91)       0.33%
> >>>> Context Switches         223861 (138865)        225032 (139632)        0.52%
> >>>> Sleeps                   330529 (13495.1)       332001 (14746.2)       0.45%
> >>>>
> >>>> During a run on the SPF kernel, perf events were captured:
> >>>> Performance counter stats for '../kernbench -M':
> >>>>          116730856      faults
> >>>>                  0      spf
> >>>>                  3      pagefault:spf_vma_changed
> >>>>                  0      pagefault:spf_vma_noanon
> >>>>                476      pagefault:spf_vma_notsup
> >>>>                  0      pagefault:spf_vma_access
> >>>>                  0      pagefault:spf_pmd_changed
> >>>>
> >>>> Most of the processes involved are monothreaded, so SPF is not activated,
> >>>> but there is no impact on the performance.
> >>>>
> >>>> Ebizzy:
> >>>> -------
> >>>> The test counts the number of records per second it can manage; the higher
> >>>> the better. I ran it like this: 'ebizzy -mTt <nrcpus>'. To get consistent
> >>>> results I repeated the test 100 times and measured the average. The number
> >>>> is the records processed per second; the higher the better.
> >>>>
> >>>>                          BASE            SPF             delta
> >>>> 16 CPUs x86 VM           742.57          1490.24         100.69%
> >>>> 80 CPUs P8 node          13105.4         24174.23        84.46%
> >>>>
> >>>> Here are the performance counters read during a run on a 16-CPU x86 VM:
> >>>> Performance counter stats for './ebizzy -mTt 16':
> >>>>            1706379      faults
> >>>>            1674599      spf
> >>>>              30588      pagefault:spf_vma_changed
> >>>>                  0      pagefault:spf_vma_noanon
> >>>>                363      pagefault:spf_vma_notsup
> >>>>                  0      pagefault:spf_vma_access
> >>>>                  0      pagefault:spf_pmd_changed
> >>>>
> >>>> And the ones captured during a run on an 80-CPU Power node:
> >>>> Performance counter stats for './ebizzy -mTt 80':
> >>>>            1874773      faults
> >>>>            1461153      spf
> >>>>             413293      pagefault:spf_vma_changed
> >>>>                  0      pagefault:spf_vma_noanon
> >>>>                200      pagefault:spf_vma_notsup
> >>>>                  0      pagefault:spf_vma_access
> >>>>                  0      pagefault:spf_pmd_changed
> >>>>
> >>>> In ebizzy's case most of the page faults were handled in a speculative
> >>>> way, leading to the ebizzy performance boost.
> >>>>
> >>>> ------------------
> >>>> Changes since v10 (https://lkml.org/lkml/2018/4/17/572):
> >>>> - Accounted for all review feedback from Punit Agrawal, Ganesh Mahendran
> >>>>   and Minchan Kim, hopefully.
> >>>> - Remove unneeded check on CONFIG_SPECULATIVE_PAGE_FAULT in
> >>>>   __do_page_fault().
> >>>> - Loop in pte_spinlock() and pte_map_lock() when the PTE try-lock fails
> >>>>   instead of aborting the speculative page fault handling. Drop the now
> >>>>   useless trace event pagefault:spf_pte_lock.
> >>>> - No longer try to reuse the fetched VMA during the speculative page fault
> >>>>   handling when retrying is needed. This added a lot of complexity and
> >>>>   additional tests didn't show a significant performance improvement.
> >>>> - Convert IS_ENABLED(CONFIG_NUMA) back to #ifdef due to build error. > >>>> > >>>> [1] http://linux-kernel.2935.n7.nabble.com/RFC-PATCH-0-6-Another-go-at-speculative-page-faults-tt965642.html#none > >>>> [2] https://patchwork.kernel.org/patch/9999687/ > >>>> > >>>> > >>>> Laurent Dufour (20): > >>>> mm: introduce CONFIG_SPECULATIVE_PAGE_FAULT > >>>> x86/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT > >>>> powerpc/mm: set ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT > >>>> mm: introduce pte_spinlock for FAULT_FLAG_SPECULATIVE > >>>> mm: make pte_unmap_same compatible with SPF > >>>> mm: introduce INIT_VMA() > >>>> mm: protect VMA modifications using VMA sequence count > >>>> mm: protect mremap() against SPF hanlder > >>>> mm: protect SPF handler against anon_vma changes > >>>> mm: cache some VMA fields in the vm_fault structure > >>>> mm/migrate: Pass vm_fault pointer to migrate_misplaced_page() > >>>> mm: introduce __lru_cache_add_active_or_unevictable > >>>> mm: introduce __vm_normal_page() > >>>> mm: introduce __page_add_new_anon_rmap() > >>>> mm: protect mm_rb tree with a rwlock > >>>> mm: adding speculative page fault failure trace events > >>>> perf: add a speculative page fault sw event > >>>> perf tools: add support for the SPF perf event > >>>> mm: add speculative page fault vmstats > >>>> powerpc/mm: add speculative page fault > >>>> > >>>> Mahendran Ganesh (2): > >>>> arm64/mm: define ARCH_SUPPORTS_SPECULATIVE_PAGE_FAULT > >>>> arm64/mm: add speculative page fault > >>>> > >>>> Peter Zijlstra (4): > >>>> mm: prepare for FAULT_FLAG_SPECULATIVE > >>>> mm: VMA sequence count > >>>> mm: provide speculative fault infrastructure > >>>> x86/mm: add speculative pagefault handling > >>>> > >>>> arch/arm64/Kconfig | 1 + > >>>> arch/arm64/mm/fault.c | 12 + > >>>> arch/powerpc/Kconfig | 1 + > >>>> arch/powerpc/mm/fault.c | 16 + > >>>> arch/x86/Kconfig | 1 + > >>>> arch/x86/mm/fault.c | 27 +- > >>>> fs/exec.c | 2 +- > >>>> fs/proc/task_mmu.c | 5 +- > >>>> fs/userfaultfd.c | 17 +- > >>>> include/linux/hugetlb_inline.h | 2 +- > >>>> include/linux/migrate.h | 4 +- > >>>> include/linux/mm.h | 136 +++++++- > >>>> include/linux/mm_types.h | 7 + > >>>> include/linux/pagemap.h | 4 +- > >>>> include/linux/rmap.h | 12 +- > >>>> include/linux/swap.h | 10 +- > >>>> include/linux/vm_event_item.h | 3 + > >>>> include/trace/events/pagefault.h | 80 +++++ > >>>> include/uapi/linux/perf_event.h | 1 + > >>>> kernel/fork.c | 5 +- > >>>> mm/Kconfig | 22 ++ > >>>> mm/huge_memory.c | 6 +- > >>>> mm/hugetlb.c | 2 + > >>>> mm/init-mm.c | 3 + > >>>> mm/internal.h | 20 ++ > >>>> mm/khugepaged.c | 5 + > >>>> mm/madvise.c | 6 +- > >>>> mm/memory.c | 612 +++++++++++++++++++++++++++++----- > >>>> mm/mempolicy.c | 51 ++- > >>>> mm/migrate.c | 6 +- > >>>> mm/mlock.c | 13 +- > >>>> mm/mmap.c | 229 ++++++++++--- > >>>> mm/mprotect.c | 4 +- > >>>> mm/mremap.c | 13 + > >>>> mm/nommu.c | 2 +- > >>>> mm/rmap.c | 5 +- > >>>> mm/swap.c | 6 +- > >>>> mm/swap_state.c | 8 +- > >>>> mm/vmstat.c | 5 +- > >>>> tools/include/uapi/linux/perf_event.h | 1 + > >>>> tools/perf/util/evsel.c | 1 + > >>>> tools/perf/util/parse-events.c | 4 + > >>>> tools/perf/util/parse-events.l | 1 + > >>>> tools/perf/util/python.c | 1 + > >>>> 44 files changed, 1161 insertions(+), 211 deletions(-) > >>>> create mode 100644 include/trace/events/pagefault.h > >>>> > >>>> -- > >>>> 2.7.4 > >>>> > >>>> > >>> > >> > > >
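For reference, the per-VMA validation the cover letter above describes boils down to the
kernel's usual seqcount read/retry pattern combined with a reference count that keeps the
VMA alive while it is used without the mmap_sem. Below is only a minimal illustrative
sketch of that pattern using the generic seqcount and atomic_t APIs; the structure layout
and the example_* helper names are made up here for illustration and do not reproduce the
actual code of the series.

#include <linux/atomic.h>
#include <linux/seqlock.h>
#include <linux/slab.h>
#include <linux/types.h>

/* Illustrative only: field and helper names are not taken from the series. */
struct example_spf_vma {
	unsigned long	vm_flags;	/* one of the attributes checked by SPF     */
	seqcount_t	vm_seq;		/* bumped around every attribute update     */
	atomic_t	vm_ref_count;	/* keeps the VMA alive without the mmap_sem */
};

/* Writer side (mmap_sem held for write): wrap every change to checked attributes. */
static void example_vma_set_flags(struct example_spf_vma *v, unsigned long flags)
{
	write_seqcount_begin(&v->vm_seq);
	v->vm_flags = flags;
	write_seqcount_end(&v->vm_seq);
}

/* get_vma()/put_vma() style pinning so the VMA cannot be freed behind our back. */
static void example_get_vma(struct example_spf_vma *v)
{
	atomic_inc(&v->vm_ref_count);
}

static void example_put_vma(struct example_spf_vma *v)
{
	if (atomic_dec_and_test(&v->vm_ref_count))
		kfree(v);	/* last user frees the VMA */
}

/* Speculative reader: sample the sequence, do the work, then re-check it. */
static bool example_spf_try(struct example_spf_vma *v)
{
	unsigned int seq;
	bool ok;

	example_get_vma(v);
	seq = read_seqcount_begin(&v->vm_seq);	/* also waits out a writer in progress */

	/* ... check vm_flags, walk the page tables, prepare the fault ... */

	ok = !read_seqcount_retry(&v->vm_seq, seq);
	example_put_vma(v);

	return ok;	/* false: fall back to the classic fault under mmap_sem */
}

The real series integrates the sequence count and the reference count into
vm_area_struct itself and re-checks them once the PTE lock is taken, as the pseudo code
in the cover letter shows.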
========================================================================================= tbox_group/testcase/rootfs/kconfig/compiler/nr_task/thp_enabled/test/cpufreq_governor: lkp-skl-4sp1/will-it-scale/debian-x86_64-2018-04-03.cgz/x86_64-rhel-7.2/gcc-7/100%/always/page_fault3/performance commit: ba98a1cdad71d259a194461b3a61471b49b14df1 a7a8993bfe3ccb54ad468b9f1799649e4ad1ff12 ba98a1cdad71d259 a7a8993bfe3ccb54ad468b9f17 ---------------- -------------------------- fail:runs %reproduction fail:runs | | | 44:3 -13% 43:3 perf-profile.calltrace.cycles-pp.error_entry 22:3 -6% 22:3 perf-profile.calltrace.cycles-pp.sync_regs.error_entry 44:3 -13% 44:3 perf-profile.children.cycles-pp.error_entry 21:3 -7% 21:3 perf-profile.self.cycles-pp.error_entry %stddev %change %stddev \ | \ 10519 ± 3% -20.5% 8368 ± 6% will-it-scale.per_thread_ops 118098 +11.2% 131287 ± 2% will-it-scale.time.involuntary_context_switches 6.084e+08 ± 3% -20.4% 4.845e+08 ± 6% will-it-scale.time.minor_page_faults 7467 +5.0% 7841 will-it-scale.time.percent_of_cpu_this_job_got 44922 +5.0% 47176 will-it-scale.time.system_time 7126337 ± 3% -15.4% 6025689 ± 6% will-it-scale.time.voluntary_context_switches 91905646 -1.3% 90673935 will-it-scale.workload 27.15 ± 6% -8.7% 24.80 ± 10% boot-time.boot 2516213 ± 6% +8.3% 2726303 interrupts.CAL:Function_call_interrupts 388.00 ± 9% +60.2% 621.67 ± 20% irq_exception_noise.softirq_nr 11.28 ± 2% -1.9 9.37 ± 4% mpstat.cpu.idle% 10065 ±140% +243.4% 34559 ± 4% numa-numastat.node0.other_node 18739 -11.6% 16573 ± 3% uptime.idle 29406 ± 2% -11.8% 25929 ± 5% vmstat.system.cs 329614 ± 8% +17.0% 385618 ± 10% meminfo.DirectMap4k 237851 +21.2% 288160 ± 5% meminfo.Inactive 237615 +21.2% 287924 ± 5% meminfo.Inactive(anon) 7917847 -10.7% 7071860 softirqs.RCU 4784181 ± 3% -14.5% 4089039 ± 4% softirqs.SCHED 45666107 ± 7% +12.9% 51535472 ± 3% softirqs.TIMER 2.617e+09 ± 2% -13.9% 2.253e+09 ± 6% cpuidle.C1E.time 6688774 ± 2% -12.8% 5835101 ± 5% cpuidle.C1E.usage 1.022e+10 ± 2% -18.0% 8.376e+09 ± 3% cpuidle.C6.time 13440993 ± 2% -16.3% 11243794 ± 4% cpuidle.C6.usage 54781 ± 16% +37.5% 75347 ± 12% numa-meminfo.node0.Inactive 54705 ± 16% +37.7% 75347 ± 12% numa-meminfo.node0.Inactive(anon) 52522 +35.0% 70886 ± 6% numa-meminfo.node2.Inactive 52443 +34.7% 70653 ± 6% numa-meminfo.node2.Inactive(anon) 31046 ± 6% +30.3% 40457 ± 11% numa-meminfo.node2.SReclaimable 58563 +21.1% 70945 ± 6% proc-vmstat.nr_inactive_anon 58564 +21.1% 70947 ± 6% proc-vmstat.nr_zone_inactive_anon 69701118 -1.2% 68842151 proc-vmstat.pgalloc_normal 2.765e+10 -1.3% 2.729e+10 proc-vmstat.pgfault 69330418 -1.2% 68466824 proc-vmstat.pgfree 118098 +11.2% 131287 ± 2% time.involuntary_context_switches 6.084e+08 ± 3% -20.4% 4.845e+08 ± 6% time.minor_page_faults 7467 +5.0% 7841 time.percent_of_cpu_this_job_got 44922 +5.0% 47176 time.system_time 7126337 ± 3% -15.4% 6025689 ± 6% time.voluntary_context_switches 13653 ± 16% +33.5% 18225 ± 12% numa-vmstat.node0.nr_inactive_anon 13651 ± 16% +33.5% 18224 ± 12% numa-vmstat.node0.nr_zone_inactive_anon 13069 ± 3% +30.1% 17001 ± 4% numa-vmstat.node2.nr_inactive_anon 134.67 ± 42% -49.5% 68.00 ± 31% numa-vmstat.node2.nr_mlock 7758 ± 6% +30.4% 10112 ± 11% numa-vmstat.node2.nr_slab_reclaimable 13066 ± 3% +30.1% 16998 ± 4% numa-vmstat.node2.nr_zone_inactive_anon 1039 ± 11% -17.5% 857.33 slabinfo.Acpi-ParseExt.active_objs 1039 ± 11% -17.5% 857.33 slabinfo.Acpi-ParseExt.num_objs 2566 ± 6% -8.8% 2340 ± 5% slabinfo.biovec-64.active_objs 2566 ± 6% -8.8% 2340 ± 5% slabinfo.biovec-64.num_objs 898.33 ± 3% -9.5% 813.33 ± 3% 
slabinfo.kmem_cache_node.active_objs 1066 ± 2% -8.0% 981.33 ± 3% slabinfo.kmem_cache_node.num_objs 1940 +2.3% 1984 turbostat.Avg_MHz 6679037 ± 2% -12.7% 5830270 ± 5% turbostat.C1E 2.25 ± 2% -0.3 1.94 ± 6% turbostat.C1E% 13418115 -16.3% 11234510 ± 4% turbostat.C6 8.75 ± 2% -1.6 7.18 ± 3% turbostat.C6% 5.99 ± 2% -14.4% 5.13 ± 4% turbostat.CPU%c1 5.01 ± 3% -20.1% 4.00 ± 4% turbostat.CPU%c6 1.77 ± 3% -34.7% 1.15 turbostat.Pkg%pc2 1.378e+13 +1.2% 1.394e+13 perf-stat.branch-instructions 0.98 -0.0 0.94 perf-stat.branch-miss-rate% 1.344e+11 -2.3% 1.313e+11 perf-stat.branch-misses 1.076e+11 -1.8% 1.057e+11 perf-stat.cache-misses 2.258e+11 -2.1% 2.21e+11 perf-stat.cache-references 17788064 ± 2% -11.9% 15674207 ± 6% perf-stat.context-switches 2.241e+14 +2.4% 2.294e+14 perf-stat.cpu-cycles 1.929e+13 +2.2% 1.971e+13 perf-stat.dTLB-loads 4.01 -0.2 3.83 perf-stat.dTLB-store-miss-rate% 4.519e+11 -1.3% 4.461e+11 perf-stat.dTLB-store-misses 1.082e+13 +3.6% 1.121e+13 perf-stat.dTLB-stores 3.02e+10 +23.2% 3.721e+10 ± 3% perf-stat.iTLB-load-misses 2.721e+08 ± 8% -8.8% 2.481e+08 ± 3% perf-stat.iTLB-loads 6.985e+13 +1.8% 7.111e+13 perf-stat.instructions 2313 -17.2% 1914 ± 3% perf-stat.instructions-per-iTLB-miss 2.764e+10 -1.3% 2.729e+10 perf-stat.minor-faults 1.421e+09 ± 2% -16.4% 1.188e+09 ± 9% perf-stat.node-load-misses 1.538e+10 -9.3% 1.395e+10 perf-stat.node-loads 9.75 +1.4 11.10 perf-stat.node-store-miss-rate% 3.012e+09 +14.1% 3.437e+09 perf-stat.node-store-misses 2.789e+10 -1.3% 2.753e+10 perf-stat.node-stores 2.764e+10 -1.3% 2.729e+10 perf-stat.page-faults 760059 +3.2% 784235 perf-stat.path-length 193545 ± 25% -57.8% 81757 ± 46% sched_debug.cfs_rq:/.MIN_vruntime.avg 26516863 ± 19% -49.7% 13338070 ± 33% sched_debug.cfs_rq:/.MIN_vruntime.max 2202271 ± 21% -53.2% 1029581 ± 38% sched_debug.cfs_rq:/.MIN_vruntime.stddev 193545 ± 25% -57.8% 81757 ± 46% sched_debug.cfs_rq:/.max_vruntime.avg 26516863 ± 19% -49.7% 13338070 ± 33% sched_debug.cfs_rq:/.max_vruntime.max 2202271 ± 21% -53.2% 1029581 ± 38% sched_debug.cfs_rq:/.max_vruntime.stddev 0.32 ± 70% +253.2% 1.14 ± 54% sched_debug.cfs_rq:/.removed.load_avg.avg 4.44 ± 70% +120.7% 9.80 ± 27% sched_debug.cfs_rq:/.removed.load_avg.stddev 14.90 ± 70% +251.0% 52.31 ± 53% sched_debug.cfs_rq:/.removed.runnable_sum.avg 205.71 ± 70% +119.5% 451.60 ± 27% sched_debug.cfs_rq:/.removed.runnable_sum.stddev 0.16 ± 70% +237.9% 0.54 ± 50% sched_debug.cfs_rq:/.removed.util_avg.avg 2.23 ± 70% +114.2% 4.77 ± 24% sched_debug.cfs_rq:/.removed.util_avg.stddev 573.70 ± 5% -9.7% 518.06 ± 6% sched_debug.cfs_rq:/.util_avg.min 114.87 ± 8% +14.1% 131.04 ± 10% sched_debug.cfs_rq:/.util_est_enqueued.avg 64.42 ± 54% -63.9% 23.27 ± 68% sched_debug.cpu.cpu_load[1].max 5.05 ± 48% -55.2% 2.26 ± 51% sched_debug.cpu.cpu_load[1].stddev 57.58 ± 59% -60.3% 22.88 ± 70% sched_debug.cpu.cpu_load[2].max 21019 ± 3% -15.1% 17841 ± 6% sched_debug.cpu.nr_switches.min 20797 ± 3% -15.0% 17670 ± 6% sched_debug.cpu.sched_count.min 10287 ± 3% -15.1% 8736 ± 6% sched_debug.cpu.sched_goidle.avg 13693 ± 2% -10.7% 12233 ± 5% sched_debug.cpu.sched_goidle.max 9976 ± 3% -16.0% 8381 ± 7% sched_debug.cpu.sched_goidle.min 0.00 ± 26% +98.9% 0.00 ± 28% sched_debug.rt_rq:/.rt_time.min 4230 ±141% -100.0% 0.00 latency_stats.avg.trace_module_notify.notifier_call_chain.blocking_notifier_call_chain.do_init_module.load_module.__do_sys_finit_module.do_syscall_64.entry_SYSCALL_64_after_hwframe 28498 ±141% -100.0% 0.00 latency_stats.avg.perf_event_alloc.__do_sys_perf_event_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 4065 ±138% -92.2% 
315.33 ± 91% latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup_revalidate.lookup_fast.walk_component.link_path_walk.path_lookupat.filename_lookup 0.00 +3.6e+105% 3641 ±141% latency_stats.avg.down.console_lock.console_device.tty_lookup_driver.tty_open.chrdev_open.do_dentry_open.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +2.5e+106% 25040 ±141% latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +3.4e+106% 34015 ±141% latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_get_acl.get_acl.posix_acl_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open 0.00 +4.8e+106% 47686 ±141% latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64 4230 ±141% -100.0% 0.00 latency_stats.max.trace_module_notify.notifier_call_chain.blocking_notifier_call_chain.do_init_module.load_module.__do_sys_finit_module.do_syscall_64.entry_SYSCALL_64_after_hwframe 28498 ±141% -100.0% 0.00 latency_stats.max.perf_event_alloc.__do_sys_perf_event_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 4065 ±138% -92.2% 315.33 ± 91% latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup_revalidate.lookup_fast.walk_component.link_path_walk.path_lookupat.filename_lookup 4254 ±134% -88.0% 511.67 ± 90% latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat 43093 ± 35% +76.6% 76099 ±115% latency_stats.max.blk_execute_rq.scsi_execute.ioctl_internal_command.scsi_set_medium_removal.cdrom_release.[cdrom].sr_block_release.[sr_mod].__blkdev_put.blkdev_close.__fput.task_work_run.exit_to_usermode_loop.do_syscall_64 24139 ± 70% +228.5% 79285 ±105% latency_stats.max.blk_execute_rq.scsi_execute.scsi_test_unit_ready.sr_check_events.[sr_mod].cdrom_check_events.[cdrom].sr_block_check_events.[sr_mod].disk_check_events.disk_clear_events.check_disk_change.sr_block_open.[sr_mod].__blkdev_get.blkdev_get 0.00 +3.6e+105% 3641 ±141% latency_stats.max.down.console_lock.console_device.tty_lookup_driver.tty_open.chrdev_open.do_dentry_open.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +2.5e+106% 25040 ±141% latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +3.4e+106% 34015 ±141% latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_get_acl.get_acl.posix_acl_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open 0.00 +6.5e+106% 64518 ±141% latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64 4230 ±141% -100.0% 0.00 latency_stats.sum.trace_module_notify.notifier_call_chain.blocking_notifier_call_chain.do_init_module.load_module.__do_sys_finit_module.do_syscall_64.entry_SYSCALL_64_after_hwframe 28498 
±141% -100.0% 0.00 latency_stats.sum.perf_event_alloc.__do_sys_perf_event_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 4065 ±138% -92.2% 315.33 ± 91% latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup_revalidate.lookup_fast.walk_component.link_path_walk.path_lookupat.filename_lookup 57884 ± 9% +47.3% 85264 ±118% latency_stats.sum.blk_execute_rq.scsi_execute.ioctl_internal_command.scsi_set_medium_removal.cdrom_release.[cdrom].sr_block_release.[sr_mod].__blkdev_put.blkdev_close.__fput.task_work_run.exit_to_usermode_loop.do_syscall_64 0.00 +3.6e+105% 3641 ±141% latency_stats.sum.down.console_lock.console_device.tty_lookup_driver.tty_open.chrdev_open.do_dentry_open.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +2.5e+106% 25040 ±141% latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +3.4e+106% 34015 ±141% latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_get_acl.get_acl.posix_acl_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open 0.00 +9.5e+106% 95373 ±141% latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64 11.70 -11.7 0.00 perf-profile.calltrace.cycles-pp.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault 11.52 -11.5 0.00 perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault 10.44 -10.4 0.00 perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_fault.__do_fault.__handle_mm_fault.handle_mm_fault 9.83 -9.8 0.00 perf-profile.calltrace.cycles-pp.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault.__handle_mm_fault 9.55 -9.5 0.00 perf-profile.calltrace.cycles-pp.finish_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault 9.35 -9.3 0.00 perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault 6.81 -6.8 0.00 perf-profile.calltrace.cycles-pp.page_add_file_rmap.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault 7.71 -0.3 7.45 perf-profile.calltrace.cycles-pp.find_get_entry.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault 0.59 ± 7% -0.2 0.35 ± 70% perf-profile.calltrace.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.__do_page_fault.do_page_fault.page_fault 0.59 ± 7% -0.2 0.35 ± 70% perf-profile.calltrace.cycles-pp.apic_timer_interrupt.__do_page_fault.do_page_fault.page_fault 10.41 -0.2 10.24 perf-profile.calltrace.cycles-pp.native_irq_return_iret 7.68 -0.1 7.60 perf-profile.calltrace.cycles-pp.swapgs_restore_regs_and_return_to_usermode 0.76 -0.1 0.70 perf-profile.calltrace.cycles-pp.down_read_trylock.__do_page_fault.do_page_fault.page_fault 1.38 -0.0 1.34 perf-profile.calltrace.cycles-pp.do_page_fault 1.05 -0.0 1.02 perf-profile.calltrace.cycles-pp.trace_graph_entry.do_page_fault 0.92 +0.0 0.94 perf-profile.calltrace.cycles-pp.find_vma.__do_page_fault.do_page_fault.page_fault 0.91 +0.0 0.93 perf-profile.calltrace.cycles-pp.vmacache_find.find_vma.__do_page_fault.do_page_fault.page_fault 0.65 +0.0 0.67 perf-profile.calltrace.cycles-pp.set_page_dirty.unmap_page_range.unmap_vmas.unmap_region.do_munmap 0.62 +0.0 0.66 
perf-profile.calltrace.cycles-pp.page_mapping.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault 4.15 +0.1 4.27 perf-profile.calltrace.cycles-pp.page_remove_rmap.unmap_page_range.unmap_vmas.unmap_region.do_munmap 10.17 +0.2 10.39 perf-profile.calltrace.cycles-pp.munmap 9.56 +0.2 9.78 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.munmap 9.56 +0.2 9.78 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap 9.56 +0.2 9.78 perf-profile.calltrace.cycles-pp.unmap_region.do_munmap.vm_munmap.__x64_sys_munmap.do_syscall_64 9.54 +0.2 9.76 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_munmap.vm_munmap 9.54 +0.2 9.76 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_munmap.vm_munmap.__x64_sys_munmap 9.56 +0.2 9.78 perf-profile.calltrace.cycles-pp.do_munmap.vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe 9.56 +0.2 9.78 perf-profile.calltrace.cycles-pp.vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap 9.56 +0.2 9.78 perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap 0.00 +0.6 0.56 ± 2% perf-profile.calltrace.cycles-pp.lock_page_memcg.page_add_file_rmap.alloc_set_pte.finish_fault.handle_pte_fault 0.00 +0.6 0.59 perf-profile.calltrace.cycles-pp.page_mapping.set_page_dirty.fault_dirty_shared_page.handle_pte_fault.__handle_mm_fault 0.00 +0.6 0.60 perf-profile.calltrace.cycles-pp.current_time.file_update_time.handle_pte_fault.__handle_mm_fault.handle_mm_fault 0.00 +0.7 0.68 perf-profile.calltrace.cycles-pp.___might_sleep.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault 0.00 +0.7 0.74 perf-profile.calltrace.cycles-pp.unlock_page.fault_dirty_shared_page.handle_pte_fault.__handle_mm_fault.handle_mm_fault 0.00 +0.8 0.80 perf-profile.calltrace.cycles-pp.set_page_dirty.fault_dirty_shared_page.handle_pte_fault.__handle_mm_fault.handle_mm_fault 0.00 +0.9 0.88 perf-profile.calltrace.cycles-pp._raw_spin_lock.pte_map_lock.alloc_set_pte.finish_fault.handle_pte_fault 0.00 +0.9 0.91 perf-profile.calltrace.cycles-pp.__set_page_dirty_no_writeback.fault_dirty_shared_page.handle_pte_fault.__handle_mm_fault.handle_mm_fault 0.00 +1.3 1.27 perf-profile.calltrace.cycles-pp.pte_map_lock.alloc_set_pte.finish_fault.handle_pte_fault.__handle_mm_fault 0.00 +1.3 1.30 perf-profile.calltrace.cycles-pp.file_update_time.handle_pte_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault 0.00 +2.8 2.76 perf-profile.calltrace.cycles-pp.fault_dirty_shared_page.handle_pte_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault 0.00 +6.8 6.81 perf-profile.calltrace.cycles-pp.page_add_file_rmap.alloc_set_pte.finish_fault.handle_pte_fault.__handle_mm_fault 0.00 +9.4 9.39 perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.handle_pte_fault.__handle_mm_fault.handle_mm_fault 0.00 +9.6 9.59 perf-profile.calltrace.cycles-pp.finish_fault.handle_pte_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault 0.00 +9.8 9.77 perf-profile.calltrace.cycles-pp.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault.handle_pte_fault 0.00 +10.4 10.37 perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_fault.__do_fault.handle_pte_fault.__handle_mm_fault 0.00 +11.5 11.46 perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.handle_pte_fault.__handle_mm_fault.handle_mm_fault 0.00 +11.6 11.60 perf-profile.calltrace.cycles-pp.__do_fault.handle_pte_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault 0.00 +26.6 26.62 
perf-profile.calltrace.cycles-pp.handle_pte_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault 7.88 -0.3 7.61 perf-profile.children.cycles-pp.find_get_entry 1.34 ± 8% -0.2 1.16 ± 2% perf-profile.children.cycles-pp.hrtimer_interrupt 10.41 -0.2 10.24 perf-profile.children.cycles-pp.native_irq_return_iret 0.38 ± 28% -0.1 0.26 ± 4% perf-profile.children.cycles-pp.tick_sched_timer 11.80 -0.1 11.68 perf-profile.children.cycles-pp.__do_fault 0.55 ± 15% -0.1 0.43 ± 2% perf-profile.children.cycles-pp.__hrtimer_run_queues 0.60 -0.1 0.51 perf-profile.children.cycles-pp.pmd_devmap_trans_unstable 0.38 ± 13% -0.1 0.29 ± 4% perf-profile.children.cycles-pp.ktime_get 7.68 -0.1 7.60 perf-profile.children.cycles-pp.swapgs_restore_regs_and_return_to_usermode 5.18 -0.1 5.12 perf-profile.children.cycles-pp.trace_graph_entry 0.79 -0.1 0.73 perf-profile.children.cycles-pp.down_read_trylock 7.83 -0.1 7.76 perf-profile.children.cycles-pp.sync_regs 3.01 -0.1 2.94 perf-profile.children.cycles-pp.fault_dirty_shared_page 1.02 -0.1 0.96 perf-profile.children.cycles-pp._raw_spin_lock 4.66 -0.1 4.61 perf-profile.children.cycles-pp.prepare_ftrace_return 0.37 ± 8% -0.1 0.32 ± 3% perf-profile.children.cycles-pp.current_kernel_time64 5.26 -0.1 5.21 perf-profile.children.cycles-pp.ftrace_graph_caller 0.66 ± 5% -0.1 0.61 perf-profile.children.cycles-pp.current_time 0.18 ± 5% -0.0 0.15 ± 3% perf-profile.children.cycles-pp.update_process_times 0.27 -0.0 0.26 perf-profile.children.cycles-pp._cond_resched 0.16 -0.0 0.15 ± 3% perf-profile.children.cycles-pp.rcu_all_qs 0.94 +0.0 0.95 perf-profile.children.cycles-pp.vmacache_find 0.48 +0.0 0.50 perf-profile.children.cycles-pp.__mod_node_page_state 0.17 +0.0 0.19 ± 2% perf-profile.children.cycles-pp.__unlock_page_memcg 1.07 +0.0 1.10 perf-profile.children.cycles-pp.find_vma 0.79 ± 3% +0.1 0.86 ± 2% perf-profile.children.cycles-pp.lock_page_memcg 4.29 +0.1 4.40 perf-profile.children.cycles-pp.page_remove_rmap 1.39 ± 2% +0.1 1.52 perf-profile.children.cycles-pp.file_update_time 0.00 +0.2 0.16 perf-profile.children.cycles-pp.__vm_normal_page 9.63 +0.2 9.84 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe 9.63 +0.2 9.84 perf-profile.children.cycles-pp.do_syscall_64 9.63 +0.2 9.84 perf-profile.children.cycles-pp.unmap_page_range 10.17 +0.2 10.39 perf-profile.children.cycles-pp.munmap 9.56 +0.2 9.78 perf-profile.children.cycles-pp.unmap_region 9.56 +0.2 9.78 perf-profile.children.cycles-pp.do_munmap 9.56 +0.2 9.78 perf-profile.children.cycles-pp.vm_munmap 9.56 +0.2 9.78 perf-profile.children.cycles-pp.__x64_sys_munmap 9.54 +0.2 9.77 perf-profile.children.cycles-pp.unmap_vmas 1.01 +0.2 1.25 perf-profile.children.cycles-pp.___might_sleep 0.00 +1.6 1.59 perf-profile.children.cycles-pp.pte_map_lock 0.00 +26.9 26.89 perf-profile.children.cycles-pp.handle_pte_fault 4.25 -1.0 3.24 perf-profile.self.cycles-pp.__handle_mm_fault 1.42 -0.3 1.11 perf-profile.self.cycles-pp.alloc_set_pte 4.87 -0.3 4.59 perf-profile.self.cycles-pp.find_get_entry 10.41 -0.2 10.24 perf-profile.self.cycles-pp.native_irq_return_iret 0.37 ± 13% -0.1 0.28 ± 4% perf-profile.self.cycles-pp.ktime_get 0.60 -0.1 0.51 perf-profile.self.cycles-pp.pmd_devmap_trans_unstable 7.50 -0.1 7.42 perf-profile.self.cycles-pp.swapgs_restore_regs_and_return_to_usermode 7.83 -0.1 7.76 perf-profile.self.cycles-pp.sync_regs 4.85 -0.1 4.79 perf-profile.self.cycles-pp.trace_graph_entry 1.01 -0.1 0.95 perf-profile.self.cycles-pp._raw_spin_lock 0.78 -0.1 0.73 perf-profile.self.cycles-pp.down_read_trylock 0.36 ± 9% -0.1 
0.31 ± 4% perf-profile.self.cycles-pp.current_kernel_time64 0.28 -0.0 0.23 ± 2% perf-profile.self.cycles-pp.__do_fault 1.04 -0.0 1.00 perf-profile.self.cycles-pp.find_lock_entry 0.30 -0.0 0.28 ± 3% perf-profile.self.cycles-pp.fault_dirty_shared_page 0.70 -0.0 0.67 perf-profile.self.cycles-pp.prepare_ftrace_return 0.44 -0.0 0.42 perf-profile.self.cycles-pp.do_page_fault 0.16 -0.0 0.14 perf-profile.self.cycles-pp.rcu_all_qs 0.78 -0.0 0.77 perf-profile.self.cycles-pp.shmem_getpage_gfp 0.20 -0.0 0.19 perf-profile.self.cycles-pp._cond_resched 0.50 +0.0 0.51 perf-profile.self.cycles-pp.set_page_dirty 0.93 +0.0 0.95 perf-profile.self.cycles-pp.vmacache_find 0.36 ± 2% +0.0 0.38 perf-profile.self.cycles-pp.__might_sleep 0.47 +0.0 0.50 perf-profile.self.cycles-pp.__mod_node_page_state 0.17 +0.0 0.19 ± 2% perf-profile.self.cycles-pp.__unlock_page_memcg 2.34 +0.0 2.38 perf-profile.self.cycles-pp.unmap_page_range 0.78 ± 3% +0.1 0.85 ± 2% perf-profile.self.cycles-pp.lock_page_memcg 2.17 +0.1 2.24 perf-profile.self.cycles-pp.__do_page_fault 0.00 +0.2 0.16 ± 3% perf-profile.self.cycles-pp.__vm_normal_page 1.00 +0.2 1.24 perf-profile.self.cycles-pp.___might_sleep 0.00 +0.7 0.70 perf-profile.self.cycles-pp.pte_map_lock 0.00 +1.4 1.42 ± 2% perf-profile.self.cycles-pp.handle_pte_fault ========================================================================================= tbox_group/testcase/rootfs/kconfig/compiler/nr_task/thp_enabled/test/cpufreq_governor: lkp-skl-4sp1/will-it-scale/debian-x86_64-2018-04-03.cgz/x86_64-rhel-7.2/gcc-7/100%/never/context_switch1/performance commit: ba98a1cdad71d259a194461b3a61471b49b14df1 a7a8993bfe3ccb54ad468b9f1799649e4ad1ff12 ba98a1cdad71d259 a7a8993bfe3ccb54ad468b9f17 ---------------- -------------------------- fail:runs %reproduction fail:runs | | | :3 33% 1:3 dmesg.WARNING:at#for_ip_interrupt_entry/0x 2:3 -67% :3 kmsg.pstore:crypto_comp_decompress_failed,ret= 2:3 -67% :3 kmsg.pstore:decompression_failed %stddev %change %stddev \ | \ 224431 -1.3% 221567 will-it-scale.per_process_ops 237006 -2.2% 231907 will-it-scale.per_thread_ops 1.601e+09 ± 29% -46.9% 8.501e+08 ± 12% will-it-scale.time.involuntary_context_switches 5429 -1.6% 5344 will-it-scale.time.user_time 88596221 -1.7% 87067269 will-it-scale.workload 6863 ± 6% -9.7% 6200 boot-time.idle 144908 ± 40% -66.8% 48173 ± 93% meminfo.CmaFree 0.00 ± 70% +0.0 0.00 mpstat.cpu.iowait% 448336 ± 14% -34.8% 292125 ± 3% turbostat.C1 7684 ± 6% -9.5% 6957 uptime.idle 1.601e+09 ± 29% -46.9% 8.501e+08 ± 12% time.involuntary_context_switches 5429 -1.6% 5344 time.user_time 44013162 -1.7% 43243125 vmstat.system.cs 207684 -1.1% 205485 vmstat.system.in 2217033 ± 15% -15.8% 1866876 ± 2% cpuidle.C1.time 451218 ± 14% -34.7% 294841 ± 2% cpuidle.C1.usage 24839 ± 10% -19.9% 19896 cpuidle.POLL.time 7656 ± 11% -38.9% 4676 ± 8% cpuidle.POLL.usage 5.48 ± 49% -67.3% 1.79 ±100% irq_exception_noise.__do_page_fault.95th 9.46 ± 21% -58.2% 3.95 ± 64% irq_exception_noise.__do_page_fault.99th 35.67 ± 8% +1394.4% 533.00 ± 96% irq_exception_noise.irq_nr 52109 ± 3% -16.0% 43784 ± 4% irq_exception_noise.softirq_time 36226 ± 40% -66.7% 12048 ± 93% proc-vmstat.nr_free_cma 25916 -1.0% 25659 proc-vmstat.nr_slab_reclaimable 16279 ± 8% +2646.1% 447053 ± 82% proc-vmstat.pgalloc_movable 2231117 -18.4% 1820828 ± 20% proc-vmstat.pgalloc_normal 1109316 ± 46% -86.9% 145207 ±109% numa-numastat.node1.local_node 1114700 ± 45% -84.5% 172877 ± 85% numa-numastat.node1.numa_hit 5523 ±140% +402.8% 27768 ± 39% numa-numastat.node1.other_node 29013 ± 29% +3048.1% 913379 ± 73% 
numa-numastat.node3.local_node 65032 ± 13% +1335.1% 933270 ± 70% numa-numastat.node3.numa_hit 36018 -44.8% 19897 ± 75% numa-numastat.node3.other_node 12.79 ± 21% +7739.1% 1002 ±136% sched_debug.cpu.cpu_load[1].max 1.82 ± 10% +3901.1% 72.92 ±135% sched_debug.cpu.cpu_load[1].stddev 1.71 ± 4% +5055.8% 88.08 ±137% sched_debug.cpu.cpu_load[2].stddev 12.33 ± 23% +9061.9% 1129 ±139% sched_debug.cpu.cpu_load[3].max 1.78 ± 10% +4514.8% 82.18 ±137% sched_debug.cpu.cpu_load[3].stddev 4692 ± 72% +154.5% 11945 ± 29% sched_debug.cpu.max_idle_balance_cost.stddev 23979 -8.3% 21983 slabinfo.kmalloc-96.active_objs 1358 ± 6% -17.9% 1114 ± 3% slabinfo.nsproxy.active_objs 1358 ± 6% -17.9% 1114 ± 3% slabinfo.nsproxy.num_objs 15229 +12.4% 17119 slabinfo.pde_opener.active_objs 15229 +12.4% 17119 slabinfo.pde_opener.num_objs 59541 ± 8% -10.1% 53537 ± 8% slabinfo.vm_area_struct.active_objs 59612 ± 8% -10.1% 53604 ± 8% slabinfo.vm_area_struct.num_objs 4.163e+13 -1.4% 4.105e+13 perf-stat.branch-instructions 6.537e+11 -1.2% 6.459e+11 perf-stat.branch-misses 2.667e+10 -1.7% 2.621e+10 perf-stat.context-switches 1.21 +1.3% 1.22 perf-stat.cpi 150508 -9.8% 135825 ± 3% perf-stat.cpu-migrations 5.75 ± 33% +5.4 11.11 ± 26% perf-stat.iTLB-load-miss-rate% 3.619e+09 ± 36% +100.9% 7.272e+09 ± 30% perf-stat.iTLB-load-misses 2.089e+14 -1.3% 2.062e+14 perf-stat.instructions 64607 ± 29% -50.5% 31964 ± 37% perf-stat.instructions-per-iTLB-miss 0.83 -1.3% 0.82 perf-stat.ipc 3972 ± 4% -14.7% 3388 ± 8% numa-meminfo.node0.PageTables 207919 ± 25% -57.2% 88989 ± 74% numa-meminfo.node1.Active 207715 ± 26% -57.3% 88785 ± 74% numa-meminfo.node1.Active(anon) 356529 -34.3% 234069 ± 2% numa-meminfo.node1.FilePages 789129 ± 5% -19.8% 633161 ± 12% numa-meminfo.node1.MemUsed 34777 ± 8% -48.2% 18010 ± 30% numa-meminfo.node1.SReclaimable 69641 ± 4% -20.7% 55250 ± 12% numa-meminfo.node1.SUnreclaim 125526 ± 4% -96.3% 4602 ± 41% numa-meminfo.node1.Shmem 104419 -29.8% 73261 ± 16% numa-meminfo.node1.Slab 103661 ± 17% -72.0% 29029 ± 99% numa-meminfo.node2.Active 103661 ± 17% -72.2% 28829 ±101% numa-meminfo.node2.Active(anon) 103564 ± 18% -72.0% 29007 ±100% numa-meminfo.node2.AnonPages 671654 ± 7% -14.6% 573598 ± 4% numa-meminfo.node2.MemUsed 44206 ±127% +301.4% 177465 ± 42% numa-meminfo.node3.Active 44206 ±127% +301.0% 177263 ± 42% numa-meminfo.node3.Active(anon) 8738 +12.2% 9805 ± 8% numa-meminfo.node3.KernelStack 603605 ± 9% +27.8% 771554 ± 14% numa-meminfo.node3.MemUsed 14438 ± 6% +122.9% 32181 ± 42% numa-meminfo.node3.SReclaimable 2786 ±137% +3302.0% 94792 ± 71% numa-meminfo.node3.Shmem 71461 ± 7% +45.2% 103771 ± 29% numa-meminfo.node3.Slab 247197 ± 4% -7.8% 227843 numa-meminfo.node3.Unevictable 991.67 ± 4% -14.7% 846.00 ± 8% numa-vmstat.node0.nr_page_table_pages 51926 ± 26% -57.3% 22196 ± 74% numa-vmstat.node1.nr_active_anon 89137 -34.4% 58516 ± 2% numa-vmstat.node1.nr_file_pages 1679 ± 5% -10.8% 1498 ± 4% numa-vmstat.node1.nr_mapped 31386 ± 4% -96.3% 1150 ± 41% numa-vmstat.node1.nr_shmem 8694 ± 8% -48.2% 4502 ± 30% numa-vmstat.node1.nr_slab_reclaimable 17410 ± 4% -20.7% 13812 ± 12% numa-vmstat.node1.nr_slab_unreclaimable 51926 ± 26% -57.3% 22196 ± 74% numa-vmstat.node1.nr_zone_active_anon 1037174 ± 24% -57.0% 446205 ± 35% numa-vmstat.node1.numa_hit 961611 ± 26% -65.8% 328687 ± 50% numa-vmstat.node1.numa_local 75563 ± 44% +55.5% 117517 ± 9% numa-vmstat.node1.numa_other 25914 ± 17% -72.2% 7206 ±101% numa-vmstat.node2.nr_active_anon 25891 ± 18% -72.0% 7251 ±100% numa-vmstat.node2.nr_anon_pages 25914 ± 17% -72.2% 7206 ±101% 
numa-vmstat.node2.nr_zone_active_anon 11051 ±127% +301.0% 44309 ± 42% numa-vmstat.node3.nr_active_anon 36227 ± 40% -66.7% 12049 ± 93% numa-vmstat.node3.nr_free_cma 0.33 ±141% +25000.0% 83.67 ± 81% numa-vmstat.node3.nr_inactive_file 8739 +12.2% 9806 ± 8% numa-vmstat.node3.nr_kernel_stack 696.67 ±137% +3299.7% 23684 ± 71% numa-vmstat.node3.nr_shmem 3609 ± 6% +122.9% 8044 ± 42% numa-vmstat.node3.nr_slab_reclaimable 61799 ± 4% -7.8% 56960 numa-vmstat.node3.nr_unevictable 11053 ±127% +301.4% 44361 ± 42% numa-vmstat.node3.nr_zone_active_anon 0.33 ±141% +25000.0% 83.67 ± 81% numa-vmstat.node3.nr_zone_inactive_file 61799 ± 4% -7.8% 56960 numa-vmstat.node3.nr_zone_unevictable 217951 ± 8% +280.8% 829976 ± 65% numa-vmstat.node3.numa_hit 91303 ± 19% +689.3% 720647 ± 77% numa-vmstat.node3.numa_local 126648 -13.7% 109329 ± 13% numa-vmstat.node3.numa_other 8.54 -0.1 8.40 perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.pipe_wait.pipe_read 5.04 -0.1 4.94 perf-profile.calltrace.cycles-pp.__switch_to.read 3.43 -0.1 3.35 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.write 2.77 -0.1 2.72 perf-profile.calltrace.cycles-pp.reweight_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function 1.99 -0.0 1.94 perf-profile.calltrace.cycles-pp.copy_page_to_iter.pipe_read.__vfs_read.vfs_read.ksys_read 0.60 ± 2% -0.0 0.57 ± 2% perf-profile.calltrace.cycles-pp.find_next_bit.cpumask_next_wrap.select_idle_sibling.select_task_rq_fair.try_to_wake_up 0.81 -0.0 0.78 perf-profile.calltrace.cycles-pp.___perf_sw_event.__schedule.schedule.pipe_wait.pipe_read 0.78 +0.0 0.80 perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write 0.73 +0.0 0.75 perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.92 +0.0 0.95 perf-profile.calltrace.cycles-pp.check_preempt_wakeup.check_preempt_curr.ttwu_do_wakeup.try_to_wake_up.autoremove_wake_function 2.11 +0.0 2.15 perf-profile.calltrace.cycles-pp.security_file_permission.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe 7.00 -0.1 6.86 perf-profile.children.cycles-pp.syscall_return_via_sysret 5.26 -0.1 5.14 perf-profile.children.cycles-pp.__switch_to 5.65 -0.1 5.56 perf-profile.children.cycles-pp.reweight_entity 2.17 -0.1 2.12 perf-profile.children.cycles-pp.copy_page_to_iter 2.94 -0.0 2.90 perf-profile.children.cycles-pp.update_cfs_group 3.11 -0.0 3.07 perf-profile.children.cycles-pp.pick_next_task_fair 2.59 -0.0 2.55 perf-profile.children.cycles-pp.load_new_mm_cr3 1.92 -0.0 1.88 perf-profile.children.cycles-pp._raw_spin_lock_irqsave 1.11 -0.0 1.08 ± 2% perf-profile.children.cycles-pp.find_next_bit 0.59 -0.0 0.56 perf-profile.children.cycles-pp.finish_task_switch 0.14 ± 15% -0.0 0.11 ± 16% perf-profile.children.cycles-pp.write@plt 1.21 -0.0 1.18 perf-profile.children.cycles-pp.set_next_entity 0.85 -0.0 0.82 perf-profile.children.cycles-pp.___perf_sw_event 0.13 ± 3% -0.0 0.11 ± 4% perf-profile.children.cycles-pp.timespec_trunc 0.47 ± 2% -0.0 0.45 perf-profile.children.cycles-pp.anon_pipe_buf_release 0.38 ± 2% -0.0 0.36 perf-profile.children.cycles-pp.file_update_time 0.74 -0.0 0.73 perf-profile.children.cycles-pp.copyout 0.41 ± 2% -0.0 0.39 perf-profile.children.cycles-pp.copy_user_enhanced_fast_string 0.32 -0.0 0.30 perf-profile.children.cycles-pp.__x64_sys_read 0.14 -0.0 0.12 ± 3% perf-profile.children.cycles-pp.current_kernel_time64 0.91 +0.0 0.92 perf-profile.children.cycles-pp.touch_atime 0.40 +0.0 0.41 
perf-profile.children.cycles-pp._cond_resched 0.18 ± 2% +0.0 0.20 perf-profile.children.cycles-pp.activate_task 0.05 +0.0 0.07 ± 6% perf-profile.children.cycles-pp.default_wake_function 0.24 +0.0 0.27 ± 3% perf-profile.children.cycles-pp.rcu_all_qs 0.60 ± 2% +0.0 0.64 ± 2% perf-profile.children.cycles-pp.update_min_vruntime 0.42 ± 4% +0.0 0.46 ± 4% perf-profile.children.cycles-pp.probe_sched_switch 1.33 +0.0 1.38 perf-profile.children.cycles-pp.__fget_light 0.53 ± 2% +0.1 0.58 perf-profile.children.cycles-pp.entry_SYSCALL_64_stage2 0.31 +0.1 0.36 ± 2% perf-profile.children.cycles-pp.generic_pipe_buf_confirm 4.35 +0.1 4.41 perf-profile.children.cycles-pp.switch_mm_irqs_off 2.52 +0.1 2.58 perf-profile.children.cycles-pp.selinux_file_permission 0.00 +0.1 0.07 ± 11% perf-profile.children.cycles-pp.hrtick_update 7.00 -0.1 6.86 perf-profile.self.cycles-pp.syscall_return_via_sysret 5.26 -0.1 5.14 perf-profile.self.cycles-pp.__switch_to 0.29 -0.1 0.19 ± 2% perf-profile.self.cycles-pp.ksys_read 1.49 -0.1 1.43 perf-profile.self.cycles-pp.dequeue_task_fair 2.41 -0.1 2.35 perf-profile.self.cycles-pp.__schedule 1.46 -0.0 1.41 perf-profile.self.cycles-pp.select_task_rq_fair 2.94 -0.0 2.90 perf-profile.self.cycles-pp.update_cfs_group 0.44 -0.0 0.40 perf-profile.self.cycles-pp.dequeue_entity 0.48 -0.0 0.44 perf-profile.self.cycles-pp.finish_task_switch 2.59 -0.0 2.55 perf-profile.self.cycles-pp.load_new_mm_cr3 1.11 -0.0 1.08 ± 2% perf-profile.self.cycles-pp.find_next_bit 1.91 -0.0 1.88 perf-profile.self.cycles-pp._raw_spin_lock_irqsave 0.78 -0.0 0.75 perf-profile.self.cycles-pp.___perf_sw_event 0.14 ± 15% -0.0 0.11 ± 16% perf-profile.self.cycles-pp.write@plt 0.37 -0.0 0.35 ± 2% perf-profile.self.cycles-pp.__wake_up_common_lock 0.20 ± 2% -0.0 0.17 ± 2% perf-profile.self.cycles-pp.__fdget_pos 0.47 ± 2% -0.0 0.44 perf-profile.self.cycles-pp.anon_pipe_buf_release 0.87 -0.0 0.85 perf-profile.self.cycles-pp.copy_user_generic_unrolled 0.13 ± 3% -0.0 0.11 ± 4% perf-profile.self.cycles-pp.timespec_trunc 0.41 ± 2% -0.0 0.39 perf-profile.self.cycles-pp.copy_user_enhanced_fast_string 0.38 -0.0 0.36 perf-profile.self.cycles-pp.__wake_up_common 0.32 -0.0 0.30 perf-profile.self.cycles-pp.__x64_sys_read 0.14 ± 3% -0.0 0.12 ± 3% perf-profile.self.cycles-pp.current_kernel_time64 0.30 -0.0 0.28 perf-profile.self.cycles-pp.set_next_entity 0.28 ± 3% +0.0 0.30 perf-profile.self.cycles-pp._cond_resched 0.18 ± 2% +0.0 0.20 perf-profile.self.cycles-pp.activate_task 0.17 ± 2% +0.0 0.19 perf-profile.self.cycles-pp.__might_fault 0.05 +0.0 0.07 ± 6% perf-profile.self.cycles-pp.default_wake_function 0.17 ± 2% +0.0 0.20 perf-profile.self.cycles-pp.ttwu_do_activate 0.66 +0.0 0.69 perf-profile.self.cycles-pp.write 0.24 +0.0 0.27 ± 3% perf-profile.self.cycles-pp.rcu_all_qs 0.67 +0.0 0.70 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe 0.60 ± 2% +0.0 0.64 ± 2% perf-profile.self.cycles-pp.update_min_vruntime 0.42 ± 4% +0.0 0.46 ± 4% perf-profile.self.cycles-pp.probe_sched_switch 1.33 +0.0 1.37 perf-profile.self.cycles-pp.__fget_light 1.61 +0.0 1.66 perf-profile.self.cycles-pp.pipe_read 0.53 ± 2% +0.1 0.58 perf-profile.self.cycles-pp.entry_SYSCALL_64_stage2 0.31 +0.1 0.36 ± 2% perf-profile.self.cycles-pp.generic_pipe_buf_confirm 1.04 +0.1 1.11 perf-profile.self.cycles-pp.pipe_write 0.00 +0.1 0.07 ± 11% perf-profile.self.cycles-pp.hrtick_update 2.00 +0.1 2.08 perf-profile.self.cycles-pp.switch_mm_irqs_off ========================================================================================= 
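Note on reading the comparison blocks below: in each block the left value column is the per-run average at the base commit (ba98a1cdad71d259) and the right column the average at the head commit (a7a8993bfe3ccb54), "%change" is the relative difference of head vs. base, and "%stddev" is the spread across the runs (3 runs per commit here) as a percentage of the average. The short Python sketch below only illustrates that arithmetic; the per-run samples are made up, chosen so their averages match the will-it-scale.per_thread_ops row of the page_fault3 block that follows, and it is not the actual LKP comparison tooling.

    from statistics import mean, stdev

    # Hypothetical per-run samples: only their averages (10856 and 8344) appear
    # in the report below, as will-it-scale.per_thread_ops for base resp. head.
    base_runs = [10400, 11300, 10868]   # mean = 10856
    head_runs = [7800, 8900, 8332]      # mean = 8344

    def pct_change(base, head):
        # the "%change" column: head average relative to the base average
        return (mean(head) - mean(base)) / mean(base) * 100.0

    def pct_stddev(runs):
        # the "%stddev" column: sample stddev as a percentage of the average
        return stdev(runs) / mean(runs) * 100.0

    print("%+.1f%%" % pct_change(base_runs, head_runs))          # -23.1%
    print("+- %.0f%% / +- %.0f%%" % (pct_stddev(base_runs),      # +- 4% / +- 7%
                                     pct_stddev(head_runs)))

Run as-is it prints -23.1%, +- 4% and +- 7%, the same figures as in the per_thread_ops row of the page_fault3 block below.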
tbox_group/testcase/rootfs/kconfig/compiler/nr_task/thp_enabled/test/cpufreq_governor:
  lkp-skl-4sp1/will-it-scale/debian-x86_64-2018-04-03.cgz/x86_64-rhel-7.2/gcc-7/100%/never/page_fault3/performance

commit:
  ba98a1cdad71d259a194461b3a61471b49b14df1
  a7a8993bfe3ccb54ad468b9f1799649e4ad1ff12

ba98a1cdad71d259 a7a8993bfe3ccb54ad468b9f17
---------------- --------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           1:3          -33%            :3     dmesg.WARNING:stack_going_in_the_wrong_direction?ip=file_update_time/0x
            :3           33%           1:3     stderr.mount.nfs:Connection_timed_out
          34:3         -401%          22:3     perf-profile.calltrace.cycles-pp.error_entry.testcase
          17:3         -207%          11:3     perf-profile.calltrace.cycles-pp.sync_regs.error_entry.testcase
          34:3         -404%          22:3     perf-profile.children.cycles-pp.error_entry
           0:3           -2%           0:3     perf-profile.children.cycles-pp.error_exit
          16:3         -196%          11:3     perf-profile.self.cycles-pp.error_entry
           0:3           -2%           0:3     perf-profile.self.cycles-pp.error_exit
         %stddev     %change         %stddev
             \          |                \
    467454             -1.8%     459251        will-it-scale.per_process_ops
     10856 ±  4%      -23.1%       8344 ±  7%  will-it-scale.per_thread_ops
    118134 ±  2%      +11.7%     131943        will-it-scale.time.involuntary_context_switches
 6.277e+08 ±  4%      -23.1%  4.827e+08 ±  7%  will-it-scale.time.minor_page_faults
      7406             +5.8%       7839        will-it-scale.time.percent_of_cpu_this_job_got
     44526             +5.8%      47106        will-it-scale.time.system_time
   7351468 ±  5%      -18.3%    6009014 ±  7%  will-it-scale.time.voluntary_context_switches
  91835846             -2.2%   89778599        will-it-scale.workload
   2534640             +4.3%    2643005 ±  2%  interrupts.CAL:Function_call_interrupts
      2819 ±  5%      +22.9%       3464 ± 18%  kthread_noise.total_time
     30273 ±  4%      -12.7%      26415 ±  5%  vmstat.system.cs
      1.52 ±  2%      +15.2%       1.75 ±  2%  irq_exception_noise.__do_page_fault.99th
    296.67 ± 12%      -36.7%     187.67 ± 12%  irq_exception_noise.softirq_time
    230900 ±  3%      +30.3%     300925 ±  3%  meminfo.Inactive
    230184 ±  3%      +30.4%     300180 ±  3%  meminfo.Inactive(anon)
     11.62 ±  3%       -2.2        9.40 ±  5%  mpstat.cpu.idle%
      0.00 ± 14%       -0.0        0.00 ±  4%  mpstat.cpu.iowait%
   7992174            -11.1%    7101976 ±  3%  softirqs.RCU
   4973624 ±  2%      -12.9%    4333370 ±  2%  softirqs.SCHED
    118134 ±  2%      +11.7%     131943        time.involuntary_context_switches
 6.277e+08 ±  4%      -23.1%  4.827e+08 ±  7%  time.minor_page_faults
      7406             +5.8%       7839        time.percent_of_cpu_this_job_got
     44526             +5.8%      47106        time.system_time
   7351468 ±  5%      -18.3%    6009014 ±  7%  time.voluntary_context_switches
 2.702e+09 ±  5%      -16.7%  2.251e+09 ±  7%  cpuidle.C1E.time
   6834329 ±  5%      -15.8%    5756243 ±  7%  cpuidle.C1E.usage
 1.046e+10 ±  3%      -19.8%  8.389e+09 ±  4%  cpuidle.C6.time
  13961845 ±  3%      -19.3%   11265555 ±  4%  cpuidle.C6.usage
   1309307 ±  7%      -14.8%    1116168 ±  8%  cpuidle.POLL.time
     19774 ±  6%      -13.7%      17063 ±  7%  cpuidle.POLL.usage
      2523 ±  4%      -11.1%       2243 ±  4%  slabinfo.biovec-64.active_objs
      2523 ±  4%      -11.1%       2243 ±  4%  slabinfo.biovec-64.num_objs
      2610 ±  8%      -33.7%       1731 ± 22%  slabinfo.dmaengine-unmap-16.active_objs
      2610 ±  8%      -33.7%       1731 ± 22%  slabinfo.dmaengine-unmap-16.num_objs
      5118 ± 17%      -22.6%       3962 ±  9%  slabinfo.eventpoll_pwq.active_objs
      5118 ± 17%      -22.6%       3962 ±  9%  slabinfo.eventpoll_pwq.num_objs
      4583 ±  3%      -14.0%       3941 ±  4%  slabinfo.sock_inode_cache.active_objs
      4583 ±  3%      -14.0%       3941 ±  4%  slabinfo.sock_inode_cache.num_objs
      1933             +2.6%       1984        turbostat.Avg_MHz
   6832021 ±  5%      -15.8%    5754156 ±  7%  turbostat.C1E
      2.32 ±  5%       -0.4        1.94 ±  7%  turbostat.C1E%
  13954211 ±  3%      -19.3%   11259436 ±  4%  turbostat.C6
      8.97 ±  3%       -1.8        7.20 ±  4%  turbostat.C6%
      6.18 ±  4%      -17.1%       5.13 ±  5%  turbostat.CPU%c1
      5.12 ±  3%      -21.7%       4.01 ±  4%  turbostat.CPU%c6
      1.76 ±  2%      -34.7%       1.15 ±  2%  turbostat.Pkg%pc2
     57314 ±  4%      +30.4%      74717 ±  4%  proc-vmstat.nr_inactive_anon
     57319 ±  4%      +30.4%      74719 ±  4%
proc-vmstat.nr_zone_inactive_anon 24415 ± 19% -62.2% 9236 ± 7% proc-vmstat.numa_hint_faults 69661453 -1.8% 68405712 proc-vmstat.numa_hit 69553390 -1.8% 68297790 proc-vmstat.numa_local 8792 ± 29% -92.6% 654.33 ± 23% proc-vmstat.numa_pages_migrated 40251 ± 32% -76.5% 9474 ± 3% proc-vmstat.numa_pte_updates 69522532 -1.6% 68383074 proc-vmstat.pgalloc_normal 2.762e+10 -2.2% 2.701e+10 proc-vmstat.pgfault 68825100 -1.5% 67772256 proc-vmstat.pgfree 8792 ± 29% -92.6% 654.33 ± 23% proc-vmstat.pgmigrate_success 57992 ± 6% +56.2% 90591 ± 3% numa-meminfo.node0.Inactive 57916 ± 6% +56.3% 90513 ± 3% numa-meminfo.node0.Inactive(anon) 37285 ± 12% +36.0% 50709 ± 5% numa-meminfo.node0.SReclaimable 110971 ± 8% +22.7% 136209 ± 8% numa-meminfo.node0.Slab 23601 ± 55% +559.5% 155651 ± 36% numa-meminfo.node1.AnonPages 62484 ± 12% +17.5% 73417 ± 3% numa-meminfo.node1.Inactive 62323 ± 12% +17.2% 73023 ± 4% numa-meminfo.node1.Inactive(anon) 109714 ± 63% -85.6% 15832 ± 96% numa-meminfo.node2.AnonPages 52236 ± 13% +22.7% 64074 ± 3% numa-meminfo.node2.Inactive 51922 ± 12% +23.2% 63963 ± 3% numa-meminfo.node2.Inactive(anon) 60241 ± 11% +21.9% 73442 ± 8% numa-meminfo.node3.Inactive 60077 ± 12% +22.0% 73279 ± 8% numa-meminfo.node3.Inactive(anon) 14093 ± 6% +55.9% 21977 ± 3% numa-vmstat.node0.nr_inactive_anon 9321 ± 12% +36.0% 12675 ± 5% numa-vmstat.node0.nr_slab_reclaimable 14090 ± 6% +56.0% 21977 ± 3% numa-vmstat.node0.nr_zone_inactive_anon 5900 ± 55% +559.4% 38909 ± 36% numa-vmstat.node1.nr_anon_pages 15413 ± 12% +14.8% 17688 ± 4% numa-vmstat.node1.nr_inactive_anon 15413 ± 12% +14.8% 17688 ± 4% numa-vmstat.node1.nr_zone_inactive_anon 27430 ± 63% -85.6% 3960 ± 96% numa-vmstat.node2.nr_anon_pages 12928 ± 12% +20.0% 15508 ± 3% numa-vmstat.node2.nr_inactive_anon 12927 ± 12% +20.0% 15507 ± 3% numa-vmstat.node2.nr_zone_inactive_anon 6229 ± 10% +117.5% 13547 ± 44% numa-vmstat.node3 14669 ± 11% +19.6% 17537 ± 7% numa-vmstat.node3.nr_inactive_anon 14674 ± 11% +19.5% 17541 ± 7% numa-vmstat.node3.nr_zone_inactive_anon 24617 ±141% -100.0% 0.00 latency_stats.avg.io_schedule.nfs_lock_and_join_requests.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe 5049 ±105% -99.4% 28.33 ± 82% latency_stats.avg.call_rwsem_down_write_failed.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 152457 ± 27% +233.6% 508656 ± 92% latency_stats.avg.max 0.00 +3.9e+107% 390767 ±141% latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_openat 24617 ±141% -100.0% 0.00 latency_stats.max.io_schedule.nfs_lock_and_join_requests.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe 4240 ±141% -100.0% 0.00 latency_stats.max.call_rwsem_down_write_failed.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe 8565 ± 70% -99.1% 80.33 ±115% latency_stats.max.call_rwsem_down_write_failed.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 204835 ± 6% +457.6% 1142244 ±114% latency_stats.max.max 0.00 +5.1e+105% 5057 ±141% latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_access.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_openat.do_filp_open 0.00 +1e+108% 995083 ±141% 
latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_openat 13175 ± 4% -100.0% 0.00 latency_stats.sum.io_schedule.__lock_page_or_retry.filemap_fault.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault 24617 ±141% -100.0% 0.00 latency_stats.sum.io_schedule.nfs_lock_and_join_requests.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe 4260 ±141% -100.0% 0.00 latency_stats.sum.call_rwsem_down_write_failed.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe 8640 ± 70% -97.5% 216.33 ±108% latency_stats.sum.call_rwsem_down_write_failed.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 6673 ± 89% -92.8% 477.67 ± 74% latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat 0.00 +4.2e+105% 4228 ±130% latency_stats.sum.io_schedule.__lock_page_killable.__lock_page_or_retry.filemap_fault.__do_fault.handle_pte_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault 0.00 +7.5e+105% 7450 ± 98% latency_stats.sum.io_schedule.__lock_page_or_retry.filemap_fault.__do_fault.handle_pte_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault 0.00 +1.3e+106% 13050 ±141% latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_access.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_openat.do_filp_open 0.00 +1.5e+110% 1.508e+08 ±141% latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_openat 0.97 -0.0 0.94 perf-stat.branch-miss-rate% 1.329e+11 -2.6% 1.294e+11 perf-stat.branch-misses 2.254e+11 -1.9% 2.21e+11 perf-stat.cache-references 18308779 ± 4% -12.8% 15969618 ± 5% perf-stat.context-switches 3.20 +1.8% 3.26 perf-stat.cpi 2.233e+14 +2.7% 2.293e+14 perf-stat.cpu-cycles 4.01 -0.2 3.83 perf-stat.dTLB-store-miss-rate% 4.51e+11 -2.2% 4.41e+11 perf-stat.dTLB-store-misses 1.08e+13 +2.6% 1.109e+13 perf-stat.dTLB-stores 3.158e+10 ± 5% +16.8% 3.689e+10 ± 2% perf-stat.iTLB-load-misses 2214 ± 5% -13.8% 1907 ± 2% perf-stat.instructions-per-iTLB-miss 0.31 -1.8% 0.31 perf-stat.ipc 2.762e+10 -2.2% 2.701e+10 perf-stat.minor-faults 1.535e+10 -11.2% 1.362e+10 perf-stat.node-loads 9.75 +1.1 10.89 perf-stat.node-store-miss-rate% 3.012e+09 +10.6% 3.332e+09 ± 2% perf-stat.node-store-misses 2.787e+10 -2.2% 2.725e+10 perf-stat.node-stores 2.762e+10 -2.2% 2.701e+10 perf-stat.page-faults 759458 +3.2% 783404 perf-stat.path-length 246.39 ± 15% -20.4% 196.12 ± 6% sched_debug.cfs_rq:/.load_avg.max 0.21 ± 3% +9.0% 0.23 ± 4% sched_debug.cfs_rq:/.nr_running.stddev 16.64 ± 27% +61.0% 26.79 ± 17% sched_debug.cfs_rq:/.nr_spread_over.max 75.15 -14.4% 64.30 ± 4% sched_debug.cfs_rq:/.util_avg.stddev 178.80 ± 3% +25.4% 224.12 ± 7% sched_debug.cfs_rq:/.util_est_enqueued.avg 1075 ± 5% -12.3% 943.36 ± 2% sched_debug.cfs_rq:/.util_est_enqueued.max 2093630 ± 27% -36.1% 1337941 ± 16% sched_debug.cpu.avg_idle.max 297057 ± 11% +37.8% 409294 ± 14% sched_debug.cpu.avg_idle.min 293240 ± 55% -62.3% 110571 ± 13% 
sched_debug.cpu.avg_idle.stddev 770075 ± 9% -19.3% 621136 ± 12% sched_debug.cpu.max_idle_balance_cost.max 48919 ± 46% -66.9% 16190 ± 81% sched_debug.cpu.max_idle_balance_cost.stddev 21716 ± 5% -16.8% 18061 ± 7% sched_debug.cpu.nr_switches.min 21519 ± 5% -17.7% 17700 ± 7% sched_debug.cpu.sched_count.min 10586 ± 5% -18.1% 8669 ± 7% sched_debug.cpu.sched_goidle.avg 14183 ± 3% -17.6% 11693 ± 5% sched_debug.cpu.sched_goidle.max 10322 ± 5% -18.6% 8407 ± 7% sched_debug.cpu.sched_goidle.min 400.99 ± 8% -13.0% 348.75 ± 3% sched_debug.cpu.sched_goidle.stddev 5459 ± 8% +10.0% 6006 ± 3% sched_debug.cpu.ttwu_local.avg 8.47 ± 42% +345.8% 37.73 ± 77% sched_debug.rt_rq:/.rt_time.max 0.61 ± 42% +343.0% 2.72 ± 77% sched_debug.rt_rq:/.rt_time.stddev 91.98 -30.9 61.11 ± 70% perf-profile.calltrace.cycles-pp.testcase 9.05 -9.1 0.00 perf-profile.calltrace.cycles-pp.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault 8.91 -8.9 0.00 perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault 8.06 -8.1 0.00 perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_fault.__do_fault.__handle_mm_fault.handle_mm_fault 7.59 -7.6 0.00 perf-profile.calltrace.cycles-pp.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault.__handle_mm_fault 7.44 -7.4 0.00 perf-profile.calltrace.cycles-pp.finish_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault 7.28 -7.3 0.00 perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault 5.31 -5.3 0.00 perf-profile.calltrace.cycles-pp.page_add_file_rmap.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault 8.08 -2.8 5.30 ± 70% perf-profile.calltrace.cycles-pp.native_irq_return_iret.testcase 5.95 -2.1 3.83 ± 70% perf-profile.calltrace.cycles-pp.find_get_entry.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault 5.95 -2.0 3.93 ± 70% perf-profile.calltrace.cycles-pp.swapgs_restore_regs_and_return_to_usermode.testcase 3.10 -1.1 2.01 ± 70% perf-profile.calltrace.cycles-pp.__perf_sw_event.__do_page_fault.do_page_fault.page_fault.testcase 2.36 -0.8 1.55 ± 70% perf-profile.calltrace.cycles-pp.___perf_sw_event.__perf_sw_event.__do_page_fault.do_page_fault.page_fault 1.08 -0.4 0.70 ± 70% perf-profile.calltrace.cycles-pp.do_page_fault.testcase 0.82 -0.3 0.54 ± 70% perf-profile.calltrace.cycles-pp.trace_graph_entry.do_page_fault.testcase 0.77 -0.3 0.50 ± 70% perf-profile.calltrace.cycles-pp.ftrace_graph_caller.__do_page_fault.do_page_fault.page_fault.testcase 0.59 -0.2 0.37 ± 70% perf-profile.calltrace.cycles-pp.down_read_trylock.__do_page_fault.do_page_fault.page_fault.testcase 91.98 -30.9 61.11 ± 70% perf-profile.children.cycles-pp.testcase 9.14 -3.2 5.99 ± 70% perf-profile.children.cycles-pp.__do_fault 8.20 -2.8 5.40 ± 70% perf-profile.children.cycles-pp.shmem_getpage_gfp 8.08 -2.8 5.31 ± 70% perf-profile.children.cycles-pp.native_irq_return_iret 6.08 -2.2 3.92 ± 70% perf-profile.children.cycles-pp.find_get_entry 6.08 -2.1 3.96 ± 70% perf-profile.children.cycles-pp.sync_regs 5.95 -2.0 3.93 ± 70% perf-profile.children.cycles-pp.swapgs_restore_regs_and_return_to_usermode 4.12 -1.4 2.73 ± 70% perf-profile.children.cycles-pp.ftrace_graph_caller 3.65 -1.2 2.42 ± 70% perf-profile.children.cycles-pp.prepare_ftrace_return 3.18 -1.1 2.07 ± 70% perf-profile.children.cycles-pp.__perf_sw_event 2.34 -0.8 1.52 ± 70% perf-profile.children.cycles-pp.fault_dirty_shared_page 0.80 -0.3 0.50 ± 70% perf-profile.children.cycles-pp._raw_spin_lock 0.76 -0.3 0.50 ± 70% 
perf-profile.children.cycles-pp.tlb_flush_mmu_free 0.61 -0.2 0.39 ± 70% perf-profile.children.cycles-pp.down_read_trylock 0.48 ± 2% -0.2 0.28 ± 70% perf-profile.children.cycles-pp.pmd_devmap_trans_unstable 0.26 ± 6% -0.1 0.15 ± 71% perf-profile.children.cycles-pp.ktime_get 0.20 ± 2% -0.1 0.12 ± 70% perf-profile.children.cycles-pp.perf_exclude_event 0.22 ± 2% -0.1 0.13 ± 70% perf-profile.children.cycles-pp._cond_resched 0.17 -0.1 0.11 ± 70% perf-profile.children.cycles-pp.page_rmapping 0.13 -0.1 0.07 ± 70% perf-profile.children.cycles-pp.rcu_all_qs 0.07 -0.0 0.04 ± 70% perf-profile.children.cycles-pp.ftrace_lookup_ip 22.36 -7.8 14.59 ± 70% perf-profile.self.cycles-pp.testcase 8.08 -2.8 5.31 ± 70% perf-profile.self.cycles-pp.native_irq_return_iret 6.08 -2.1 3.96 ± 70% perf-profile.self.cycles-pp.sync_regs 5.81 -2.0 3.84 ± 70% perf-profile.self.cycles-pp.swapgs_restore_regs_and_return_to_usermode 3.27 -1.6 1.65 ± 70% perf-profile.self.cycles-pp.__handle_mm_fault 3.79 -1.4 2.36 ± 70% perf-profile.self.cycles-pp.find_get_entry 3.80 -1.3 2.53 ± 70% perf-profile.self.cycles-pp.trace_graph_entry 1.10 -0.5 0.57 ± 70% perf-profile.self.cycles-pp.alloc_set_pte 1.24 -0.4 0.81 ± 70% perf-profile.self.cycles-pp.shmem_fault 0.80 -0.3 0.50 ± 70% perf-profile.self.cycles-pp._raw_spin_lock 0.81 -0.3 0.51 ± 70% perf-profile.self.cycles-pp.find_lock_entry 0.80 ± 2% -0.3 0.51 ± 70% perf-profile.self.cycles-pp.__perf_sw_event 0.61 -0.2 0.38 ± 70% perf-profile.self.cycles-pp.down_read_trylock 0.60 -0.2 0.39 ± 70% perf-profile.self.cycles-pp.shmem_getpage_gfp 0.48 -0.2 0.27 ± 70% perf-profile.self.cycles-pp.pmd_devmap_trans_unstable 0.47 -0.2 0.30 ± 70% perf-profile.self.cycles-pp.file_update_time 0.34 -0.1 0.22 ± 70% perf-profile.self.cycles-pp.do_page_fault 0.22 ± 4% -0.1 0.11 ± 70% perf-profile.self.cycles-pp.__do_fault 0.25 ± 5% -0.1 0.14 ± 71% perf-profile.self.cycles-pp.ktime_get 0.21 ± 2% -0.1 0.12 ± 70% perf-profile.self.cycles-pp.finish_fault 0.23 ± 2% -0.1 0.14 ± 70% perf-profile.self.cycles-pp.fault_dirty_shared_page 0.22 ± 2% -0.1 0.14 ± 70% perf-profile.self.cycles-pp.prepare_exit_to_usermode 0.20 ± 2% -0.1 0.12 ± 70% perf-profile.self.cycles-pp.perf_exclude_event 0.16 -0.1 0.10 ± 70% perf-profile.self.cycles-pp._cond_resched 0.13 -0.1 0.07 ± 70% perf-profile.self.cycles-pp.rcu_all_qs 0.07 -0.0 0.04 ± 70% perf-profile.self.cycles-pp.ftrace_lookup_ip ========================================================================================= tbox_group/testcase/rootfs/kconfig/compiler/nr_task/thp_enabled/test/cpufreq_governor: lkp-skl-4sp1/will-it-scale/debian-x86_64-2018-04-03.cgz/x86_64-rhel-7.2/gcc-7/100%/always/context_switch1/performance commit: ba98a1cdad71d259a194461b3a61471b49b14df1 a7a8993bfe3ccb54ad468b9f1799649e4ad1ff12 ba98a1cdad71d259 a7a8993bfe3ccb54ad468b9f17 ---------------- -------------------------- fail:runs %reproduction fail:runs | | | :3 33% 1:3 dmesg.WARNING:at#for_ip_interrupt_entry/0x :3 33% 1:3 dmesg.WARNING:at#for_ip_ret_from_intr/0x :3 67% 2:3 kmsg.pstore:crypto_comp_decompress_failed,ret= :3 67% 2:3 kmsg.pstore:decompression_failed %stddev %change %stddev \ | \ 223910 -1.3% 220930 will-it-scale.per_process_ops 233722 -1.0% 231288 will-it-scale.per_thread_ops 6.001e+08 ± 13% +31.4% 7.887e+08 ± 4% will-it-scale.time.involuntary_context_switches 18003 ± 4% +10.9% 19956 will-it-scale.time.minor_page_faults 1.29e+10 -2.5% 1.258e+10 will-it-scale.time.voluntary_context_switches 87865617 -1.2% 86826277 will-it-scale.workload 2880329 ± 2% +5.4% 3034904 
interrupts.CAL:Function_call_interrupts 7695018 -23.3% 5905066 ± 8% meminfo.DirectMap2M 0.00 ± 39% -0.0 0.00 ± 78% mpstat.cpu.iowait% 4621 ± 12% +13.4% 5241 proc-vmstat.numa_hint_faults_local 715714 +27.6% 913142 ± 13% softirqs.SCHED 515653 ± 6% -20.0% 412650 ± 15% turbostat.C1 43643516 -1.2% 43127031 vmstat.system.cs 2893393 ± 4% -23.6% 2210524 ± 10% cpuidle.C1.time 518051 ± 6% -19.9% 415081 ± 15% cpuidle.C1.usage 23.10 +22.9% 28.38 ± 9% boot-time.boot 18.38 +23.2% 22.64 ± 12% boot-time.dhcp 5216 +5.0% 5478 ± 2% boot-time.idle 963.76 ± 44% +109.7% 2021 ± 34% irq_exception_noise.__do_page_fault.sum 6.33 ± 14% +726.3% 52.33 ± 62% irq_exception_noise.irq_time 56524 ± 7% -18.8% 45915 ± 4% irq_exception_noise.softirq_time 6.001e+08 ± 13% +31.4% 7.887e+08 ± 4% time.involuntary_context_switches 18003 ± 4% +10.9% 19956 time.minor_page_faults 1.29e+10 -2.5% 1.258e+10 time.voluntary_context_switches 1386 ± 7% +15.4% 1600 ± 11% slabinfo.scsi_sense_cache.active_objs 1386 ± 7% +15.4% 1600 ± 11% slabinfo.scsi_sense_cache.num_objs 1427 ± 5% -8.9% 1299 ± 2% slabinfo.task_group.active_objs 1427 ± 5% -8.9% 1299 ± 2% slabinfo.task_group.num_objs 65519 ± 12% +20.6% 79014 ± 16% numa-meminfo.node0.SUnreclaim 8484 -11.9% 7475 ± 7% numa-meminfo.node1.KernelStack 9264 ± 26% -33.7% 6146 ± 7% numa-meminfo.node1.Mapped 2138 ± 61% +373.5% 10127 ± 92% numa-meminfo.node3.Inactive 2059 ± 61% +387.8% 10046 ± 93% numa-meminfo.node3.Inactive(anon) 16379 ± 12% +20.6% 19752 ± 16% numa-vmstat.node0.nr_slab_unreclaimable 8483 -11.9% 7474 ± 7% numa-vmstat.node1.nr_kernel_stack 6250 ± 29% -42.8% 3575 ± 24% numa-vmstat.node2 3798 ± 17% +63.7% 6218 ± 5% numa-vmstat.node3 543.00 ± 61% +368.1% 2541 ± 91% numa-vmstat.node3.nr_inactive_anon 543.33 ± 61% +367.8% 2541 ± 91% numa-vmstat.node3.nr_zone_inactive_anon 4.138e+13 -1.1% 4.09e+13 perf-stat.branch-instructions 6.569e+11 -2.0% 6.441e+11 perf-stat.branch-misses 2.645e+10 -1.2% 2.613e+10 perf-stat.context-switches 1.21 +1.2% 1.23 perf-stat.cpi 153343 ± 2% -12.1% 134776 perf-stat.cpu-migrations 5.966e+13 -1.3% 5.889e+13 perf-stat.dTLB-loads 3.736e+13 -1.2% 3.69e+13 perf-stat.dTLB-stores 5.85 ± 15% +8.8 14.67 ± 9% perf-stat.iTLB-load-miss-rate% 3.736e+09 ± 17% +161.3% 9.76e+09 ± 11% perf-stat.iTLB-load-misses 5.987e+10 -5.4% 5.667e+10 perf-stat.iTLB-loads 2.079e+14 -1.2% 2.054e+14 perf-stat.instructions 57547 ± 18% -62.9% 21340 ± 11% perf-stat.instructions-per-iTLB-miss 0.82 -1.2% 0.81 perf-stat.ipc 27502531 ± 8% +9.5% 30122136 ± 3% perf-stat.node-store-misses 1449 ± 27% -34.6% 948.85 sched_debug.cfs_rq:/.load.min 319416 ±115% -188.5% -282549 sched_debug.cfs_rq:/.spread0.avg 657044 ± 55% -88.3% 76887 ± 23% sched_debug.cfs_rq:/.spread0.max -1525243 +54.6% -2357898 sched_debug.cfs_rq:/.spread0.min 101614 ± 6% +30.6% 132713 ± 19% sched_debug.cpu.avg_idle.stddev 11.54 ± 41% -61.2% 4.48 sched_debug.cpu.cpu_load[1].avg 1369 ± 67% -98.5% 20.67 ± 48% sched_debug.cpu.cpu_load[1].max 99.29 ± 67% -97.6% 2.35 ± 26% sched_debug.cpu.cpu_load[1].stddev 9.58 ± 38% -55.2% 4.29 sched_debug.cpu.cpu_load[2].avg 1024 ± 68% -98.5% 15.27 ± 36% sched_debug.cpu.cpu_load[2].max 74.51 ± 67% -97.3% 1.99 ± 15% sched_debug.cpu.cpu_load[2].stddev 7.37 ± 29% -42.0% 4.28 sched_debug.cpu.cpu_load[3].avg 600.58 ± 68% -97.9% 12.48 ± 20% sched_debug.cpu.cpu_load[3].max 43.98 ± 66% -95.8% 1.83 ± 5% sched_debug.cpu.cpu_load[3].stddev 5.95 ± 19% -28.1% 4.28 sched_debug.cpu.cpu_load[4].avg 325.39 ± 67% -96.4% 11.67 ± 10% sched_debug.cpu.cpu_load[4].max 24.19 ± 65% -92.5% 1.81 ± 3% sched_debug.cpu.cpu_load[4].stddev 907.23 ± 
4% -14.1% 779.70 ± 10% sched_debug.cpu.nr_load_updates.stddev 0.00 ± 83% +122.5% 0.00 sched_debug.rt_rq:/.rt_time.min 8.49 ± 2% -0.3 8.21 ± 2% perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.pipe_wait.pipe_read 57.28 -0.3 57.01 perf-profile.calltrace.cycles-pp.read 5.06 -0.2 4.85 perf-profile.calltrace.cycles-pp.select_task_rq_fair.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock 4.98 -0.2 4.78 perf-profile.calltrace.cycles-pp.__switch_to.read 3.55 -0.2 3.39 ± 2% perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.read 2.72 -0.1 2.60 perf-profile.calltrace.cycles-pp.reweight_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function 2.67 -0.1 2.57 ± 2% perf-profile.calltrace.cycles-pp.reweight_entity.dequeue_task_fair.__schedule.schedule.pipe_wait 3.40 -0.1 3.31 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.write 3.77 -0.1 3.68 perf-profile.calltrace.cycles-pp.select_idle_sibling.select_task_rq_fair.try_to_wake_up.autoremove_wake_function.__wake_up_common 1.95 -0.1 1.88 perf-profile.calltrace.cycles-pp.copy_page_to_iter.pipe_read.__vfs_read.vfs_read.ksys_read 2.19 -0.1 2.13 perf-profile.calltrace.cycles-pp.__switch_to_asm.read 1.30 -0.1 1.25 perf-profile.calltrace.cycles-pp.update_curr.reweight_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up 1.27 -0.1 1.22 ± 2% perf-profile.calltrace.cycles-pp.update_curr.reweight_entity.dequeue_task_fair.__schedule.schedule 2.29 -0.0 2.24 perf-profile.calltrace.cycles-pp.load_new_mm_cr3.switch_mm_irqs_off.__schedule.schedule.pipe_wait 0.96 -0.0 0.92 perf-profile.calltrace.cycles-pp.__calc_delta.update_curr.reweight_entity.dequeue_task_fair.__schedule 0.85 -0.0 0.81 ± 3% perf-profile.calltrace.cycles-pp.cpumask_next_wrap.select_idle_sibling.select_task_rq_fair.try_to_wake_up.autoremove_wake_function 1.63 -0.0 1.59 perf-profile.calltrace.cycles-pp.native_write_msr.read 0.72 -0.0 0.69 perf-profile.calltrace.cycles-pp.copyout.copy_page_to_iter.pipe_read.__vfs_read.vfs_read 0.65 ± 2% -0.0 0.62 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock 0.61 -0.0 0.58 ± 2% perf-profile.calltrace.cycles-pp.find_next_bit.cpumask_next_wrap.select_idle_sibling.select_task_rq_fair.try_to_wake_up 0.88 -0.0 0.85 perf-profile.calltrace.cycles-pp.touch_atime.pipe_read.__vfs_read.vfs_read.ksys_read 0.80 -0.0 0.77 ± 2% perf-profile.calltrace.cycles-pp.___perf_sw_event.__schedule.schedule.pipe_wait.pipe_read 0.82 -0.0 0.79 perf-profile.calltrace.cycles-pp.prepare_to_wait.pipe_wait.pipe_read.__vfs_read.vfs_read 0.72 -0.0 0.70 perf-profile.calltrace.cycles-pp.mutex_lock.pipe_write.__vfs_write.vfs_write.ksys_write 0.56 ± 2% -0.0 0.53 perf-profile.calltrace.cycles-pp.update_rq_clock.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock 0.83 -0.0 0.81 perf-profile.calltrace.cycles-pp.__wake_up_common_lock.pipe_read.__vfs_read.vfs_read.ksys_read 42.40 +0.3 42.69 perf-profile.calltrace.cycles-pp.write 31.80 +0.4 32.18 perf-profile.calltrace.cycles-pp.__vfs_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe 24.35 +0.5 24.84 perf-profile.calltrace.cycles-pp.pipe_wait.pipe_read.__vfs_read.vfs_read.ksys_read 20.36 +0.6 20.92 ± 2% perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_write 22.01 +0.6 22.58 perf-profile.calltrace.cycles-pp.schedule.pipe_wait.pipe_read.__vfs_read.vfs_read 21.87 +0.6 22.46 
perf-profile.calltrace.cycles-pp.__schedule.schedule.pipe_wait.pipe_read.__vfs_read 3.15 ± 11% +1.0 4.12 ± 14% perf-profile.calltrace.cycles-pp.ttwu_do_wakeup.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock 1.07 ± 34% +1.1 2.12 ± 31% perf-profile.calltrace.cycles-pp.tracing_record_taskinfo_sched_switch.__schedule.schedule.pipe_wait.pipe_read 0.66 ± 75% +1.1 1.72 ± 37% perf-profile.calltrace.cycles-pp.trace_save_cmdline.tracing_record_taskinfo.ttwu_do_wakeup.try_to_wake_up.autoremove_wake_function 0.75 ± 74% +1.1 1.88 ± 34% perf-profile.calltrace.cycles-pp.tracing_record_taskinfo.ttwu_do_wakeup.try_to_wake_up.autoremove_wake_function.__wake_up_common 0.69 ± 76% +1.2 1.85 ± 36% perf-profile.calltrace.cycles-pp.trace_save_cmdline.tracing_record_taskinfo_sched_switch.__schedule.schedule.pipe_wait 8.73 ± 2% -0.3 8.45 perf-profile.children.cycles-pp.dequeue_task_fair 57.28 -0.3 57.01 perf-profile.children.cycles-pp.read 6.95 -0.2 6.70 perf-profile.children.cycles-pp.syscall_return_via_sysret 5.57 -0.2 5.35 perf-profile.children.cycles-pp.reweight_entity 5.26 -0.2 5.05 perf-profile.children.cycles-pp.select_task_rq_fair 5.19 -0.2 4.99 perf-profile.children.cycles-pp.__switch_to 4.90 -0.2 4.73 ± 2% perf-profile.children.cycles-pp.update_curr 1.27 -0.1 1.13 ± 8% perf-profile.children.cycles-pp.fsnotify 3.92 -0.1 3.83 perf-profile.children.cycles-pp.select_idle_sibling 2.01 -0.1 1.93 perf-profile.children.cycles-pp.__calc_delta 2.14 -0.1 2.06 perf-profile.children.cycles-pp.copy_page_to_iter 1.58 -0.1 1.51 perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore 2.90 -0.1 2.84 perf-profile.children.cycles-pp.update_cfs_group 1.93 -0.1 1.87 perf-profile.children.cycles-pp._raw_spin_lock_irqsave 2.35 -0.1 2.29 perf-profile.children.cycles-pp.__switch_to_asm 1.33 -0.1 1.27 ± 3% perf-profile.children.cycles-pp.cpumask_next_wrap 2.57 -0.1 2.52 perf-profile.children.cycles-pp.load_new_mm_cr3 1.53 -0.1 1.47 ± 2% perf-profile.children.cycles-pp.__fdget_pos 1.11 -0.0 1.07 ± 2% perf-profile.children.cycles-pp.find_next_bit 1.18 -0.0 1.14 perf-profile.children.cycles-pp.update_rq_clock 0.88 -0.0 0.83 perf-profile.children.cycles-pp.copy_user_generic_unrolled 1.70 -0.0 1.65 perf-profile.children.cycles-pp.native_write_msr 0.97 -0.0 0.93 ± 2% perf-profile.children.cycles-pp.account_entity_dequeue 0.59 -0.0 0.56 perf-profile.children.cycles-pp.finish_task_switch 0.91 -0.0 0.88 perf-profile.children.cycles-pp.touch_atime 0.69 -0.0 0.65 perf-profile.children.cycles-pp.account_entity_enqueue 2.13 -0.0 2.09 perf-profile.children.cycles-pp.mutex_lock 0.32 ± 3% -0.0 0.29 ± 4% perf-profile.children.cycles-pp.__sb_start_write 0.84 -0.0 0.81 ± 2% perf-profile.children.cycles-pp.___perf_sw_event 0.89 -0.0 0.87 perf-profile.children.cycles-pp.prepare_to_wait 0.73 -0.0 0.71 perf-profile.children.cycles-pp.copyout 0.31 ± 2% -0.0 0.28 ± 3% perf-profile.children.cycles-pp.__list_del_entry_valid 0.46 ± 2% -0.0 0.44 perf-profile.children.cycles-pp.anon_pipe_buf_release 0.38 -0.0 0.36 ± 3% perf-profile.children.cycles-pp.idle_cpu 0.32 -0.0 0.30 ± 2% perf-profile.children.cycles-pp.__x64_sys_read 0.21 ± 2% -0.0 0.20 ± 2% perf-profile.children.cycles-pp.deactivate_task 0.13 -0.0 0.12 ± 4% perf-profile.children.cycles-pp.timespec_trunc 0.09 -0.0 0.08 perf-profile.children.cycles-pp.iov_iter_init 0.08 -0.0 0.07 perf-profile.children.cycles-pp.native_load_tls 0.11 ± 4% +0.0 0.12 perf-profile.children.cycles-pp.tick_sched_timer 0.08 ± 5% +0.0 0.10 ± 4% perf-profile.children.cycles-pp.finish_wait 0.38 ± 
2% +0.0 0.40 ± 2% perf-profile.children.cycles-pp.file_update_time 0.31 +0.0 0.33 ± 2% perf-profile.children.cycles-pp.smp_apic_timer_interrupt 0.24 ± 3% +0.0 0.26 ± 3% perf-profile.children.cycles-pp.rcu_all_qs 0.39 +0.0 0.41 perf-profile.children.cycles-pp._cond_resched 0.05 +0.0 0.07 ± 6% perf-profile.children.cycles-pp.default_wake_function 0.23 ± 2% +0.0 0.26 ± 3% perf-profile.children.cycles-pp.current_time 0.30 +0.0 0.35 ± 2% perf-profile.children.cycles-pp.generic_pipe_buf_confirm 0.52 +0.1 0.58 perf-profile.children.cycles-pp.entry_SYSCALL_64_stage2 0.00 +0.1 0.08 ± 5% perf-profile.children.cycles-pp.hrtick_update 42.40 +0.3 42.69 perf-profile.children.cycles-pp.write 31.86 +0.4 32.26 perf-profile.children.cycles-pp.__vfs_read 24.40 +0.5 24.89 perf-profile.children.cycles-pp.pipe_wait 20.40 +0.6 20.96 ± 2% perf-profile.children.cycles-pp.try_to_wake_up 22.30 +0.6 22.89 perf-profile.children.cycles-pp.schedule 22.22 +0.6 22.84 perf-profile.children.cycles-pp.__schedule 0.99 ± 36% +0.9 1.94 ± 32% perf-profile.children.cycles-pp.tracing_record_taskinfo 3.30 ± 10% +1.0 4.27 ± 13% perf-profile.children.cycles-pp.ttwu_do_wakeup 1.14 ± 31% +1.1 2.24 ± 29% perf-profile.children.cycles-pp.tracing_record_taskinfo_sched_switch 1.59 ± 46% +2.0 3.60 ± 36% perf-profile.children.cycles-pp.trace_save_cmdline 6.95 -0.2 6.70 perf-profile.self.cycles-pp.syscall_return_via_sysret 5.19 -0.2 4.99 perf-profile.self.cycles-pp.__switch_to 1.27 -0.1 1.12 ± 8% perf-profile.self.cycles-pp.fsnotify 1.49 -0.1 1.36 perf-profile.self.cycles-pp.select_task_rq_fair 2.47 -0.1 2.37 ± 2% perf-profile.self.cycles-pp.reweight_entity 0.29 -0.1 0.19 ± 2% perf-profile.self.cycles-pp.ksys_read 1.50 -0.1 1.42 perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore 2.01 -0.1 1.93 perf-profile.self.cycles-pp.__calc_delta 1.93 -0.1 1.86 perf-profile.self.cycles-pp._raw_spin_lock_irqsave 1.47 -0.1 1.40 perf-profile.self.cycles-pp.dequeue_task_fair 2.90 -0.1 2.84 perf-profile.self.cycles-pp.update_cfs_group 1.29 -0.1 1.23 perf-profile.self.cycles-pp.do_syscall_64 2.57 -0.1 2.52 perf-profile.self.cycles-pp.load_new_mm_cr3 2.28 -0.1 2.23 perf-profile.self.cycles-pp.__switch_to_asm 1.80 -0.1 1.75 perf-profile.self.cycles-pp.select_idle_sibling 1.11 -0.0 1.07 ± 2% perf-profile.self.cycles-pp.find_next_bit 0.87 -0.0 0.83 perf-profile.self.cycles-pp.copy_user_generic_unrolled 0.43 -0.0 0.39 ± 2% perf-profile.self.cycles-pp.dequeue_entity 1.70 -0.0 1.65 perf-profile.self.cycles-pp.native_write_msr 0.92 -0.0 0.88 ± 2% perf-profile.self.cycles-pp.account_entity_dequeue 0.48 -0.0 0.44 perf-profile.self.cycles-pp.finish_task_switch 0.77 -0.0 0.74 perf-profile.self.cycles-pp.___perf_sw_event 0.66 -0.0 0.63 perf-profile.self.cycles-pp.account_entity_enqueue 0.46 ± 2% -0.0 0.43 ± 2% perf-profile.self.cycles-pp.anon_pipe_buf_release 0.32 ± 3% -0.0 0.29 ± 4% perf-profile.self.cycles-pp.__sb_start_write 0.31 ± 2% -0.0 0.28 ± 3% perf-profile.self.cycles-pp.__list_del_entry_valid 0.38 -0.0 0.36 ± 3% perf-profile.self.cycles-pp.idle_cpu 0.19 ± 4% -0.0 0.17 ± 2% perf-profile.self.cycles-pp.__fdget_pos 0.50 -0.0 0.48 perf-profile.self.cycles-pp.__atime_needs_update 0.23 ± 2% -0.0 0.21 ± 3% perf-profile.self.cycles-pp.touch_atime 0.31 -0.0 0.30 perf-profile.self.cycles-pp.__x64_sys_read 0.21 ± 2% -0.0 0.20 ± 2% perf-profile.self.cycles-pp.deactivate_task 0.21 ± 2% -0.0 0.19 perf-profile.self.cycles-pp.check_preempt_curr 0.40 -0.0 0.39 perf-profile.self.cycles-pp.autoremove_wake_function 0.40 -0.0 0.38 
perf-profile.self.cycles-pp.copy_user_enhanced_fast_string 0.27 -0.0 0.26 perf-profile.self.cycles-pp.pipe_wait 0.13 -0.0 0.12 ± 4% perf-profile.self.cycles-pp.timespec_trunc 0.22 ± 2% -0.0 0.20 ± 2% perf-profile.self.cycles-pp.put_prev_entity 0.09 -0.0 0.08 perf-profile.self.cycles-pp.iov_iter_init 0.08 -0.0 0.07 perf-profile.self.cycles-pp.native_load_tls 0.11 -0.0 0.10 perf-profile.self.cycles-pp.schedule 0.12 ± 4% +0.0 0.13 perf-profile.self.cycles-pp.copyin 0.08 ± 5% +0.0 0.10 ± 4% perf-profile.self.cycles-pp.finish_wait 0.18 +0.0 0.20 ± 2% perf-profile.self.cycles-pp.ttwu_do_activate 0.28 ± 2% +0.0 0.30 ± 2% perf-profile.self.cycles-pp._cond_resched 0.24 ± 3% +0.0 0.26 ± 3% perf-profile.self.cycles-pp.rcu_all_qs 0.05 +0.0 0.07 ± 6% perf-profile.self.cycles-pp.default_wake_function 0.08 ± 14% +0.0 0.11 ± 14% perf-profile.self.cycles-pp.tracing_record_taskinfo_sched_switch 0.51 +0.0 0.55 ± 4% perf-profile.self.cycles-pp.vfs_write 0.30 +0.0 0.35 ± 2% perf-profile.self.cycles-pp.generic_pipe_buf_confirm 0.52 +0.1 0.58 perf-profile.self.cycles-pp.entry_SYSCALL_64_stage2 0.00 +0.1 0.08 ± 5% perf-profile.self.cycles-pp.hrtick_update 1.97 +0.1 2.07 ± 2% perf-profile.self.cycles-pp.switch_mm_irqs_off 1.59 ± 46% +2.0 3.60 ± 36% perf-profile.self.cycles-pp.trace_save_cmdline ========================================================================================= tbox_group/testcase/rootfs/kconfig/compiler/nr_task/thp_enabled/test/cpufreq_governor: lkp-skl-4sp1/will-it-scale/debian-x86_64-2018-04-03.cgz/x86_64-rhel-7.2/gcc-7/100%/never/brk1/performance commit: ba98a1cdad71d259a194461b3a61471b49b14df1 a7a8993bfe3ccb54ad468b9f1799649e4ad1ff12 ba98a1cdad71d259 a7a8993bfe3ccb54ad468b9f17 ---------------- -------------------------- fail:runs %reproduction fail:runs | | | :3 33% 1:3 kmsg.pstore:crypto_comp_decompress_failed,ret= :3 33% 1:3 kmsg.pstore:decompression_failed %stddev %change %stddev \ | \ 997317 -2.0% 977778 will-it-scale.per_process_ops 957.00 -7.9% 881.00 ± 3% will-it-scale.per_thread_ops 18.42 ± 3% -8.2% 16.90 will-it-scale.time.user_time 1.917e+08 -2.0% 1.879e+08 will-it-scale.workload 18.42 ± 3% -8.2% 16.90 time.user_time 0.30 ± 11% -36.7% 0.19 ± 11% turbostat.Pkg%pc2 57539 ± 51% +140.6% 138439 ± 31% meminfo.CmaFree 410877 ± 11% -22.1% 320082 ± 22% meminfo.DirectMap4k 343575 ± 27% +71.3% 588703 ± 31% numa-numastat.node0.local_node 374176 ± 24% +63.3% 611007 ± 27% numa-numastat.node0.numa_hit 1056347 ± 4% -39.9% 634843 ± 38% numa-numastat.node3.local_node 1060682 ± 4% -39.0% 646862 ± 35% numa-numastat.node3.numa_hit 14383 ± 51% +140.6% 34608 ± 31% proc-vmstat.nr_free_cma 179.00 +2.4% 183.33 proc-vmstat.nr_inactive_file 179.00 +2.4% 183.33 proc-vmstat.nr_zone_inactive_file 564483 ± 3% -38.0% 350064 ± 36% proc-vmstat.pgalloc_movable 1811959 +10.8% 2008488 ± 5% proc-vmstat.pgalloc_normal 7153 ± 42% -94.0% 431.33 ±119% latency_stats.max.pipe_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe 6627 ±141% +380.5% 31843 ±110% latency_stats.max.call_rwsem_down_write_failed_killable.do_mprotect_pkey.__x64_sys_mprotect.do_syscall_64.entry_SYSCALL_64_after_hwframe 15244 ± 31% -99.9% 15.00 ±141% latency_stats.sum.call_rwsem_down_read_failed.__do_page_fault.do_page_fault.page_fault.__get_user_8.exit_robust_list.mm_release.do_exit.do_group_exit.get_signal.do_signal.exit_to_usermode_loop 4301 ±117% -83.7% 700.33 ± 6% 
latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat 12153 ± 28% -83.1% 2056 ± 70% latency_stats.sum.pipe_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe 6772 ±141% +1105.8% 81665 ±127% latency_stats.sum.call_rwsem_down_write_failed_killable.do_mprotect_pkey.__x64_sys_mprotect.do_syscall_64.entry_SYSCALL_64_after_hwframe 2.465e+13 -1.3% 2.434e+13 perf-stat.branch-instructions 2.691e+11 -2.1% 2.635e+11 perf-stat.branch-misses 3.402e+13 -1.4% 3.355e+13 perf-stat.dTLB-loads 1.694e+13 +1.4% 1.718e+13 perf-stat.dTLB-stores 1.75 ± 50% +4.7 6.45 ± 11% perf-stat.iTLB-load-miss-rate% 4.077e+08 ± 48% +232.3% 1.355e+09 ± 11% perf-stat.iTLB-load-misses 2.31e+10 ± 2% -14.9% 1.965e+10 ± 3% perf-stat.iTLB-loads 1.163e+14 -1.6% 1.144e+14 perf-stat.instructions 346171 ± 36% -75.3% 85575 ± 11% perf-stat.instructions-per-iTLB-miss 6.174e+08 ± 2% -9.5% 5.589e+08 perf-stat.node-store-misses 595.00 ± 10% +31.4% 782.00 ± 3% slabinfo.Acpi-State.active_objs 595.00 ± 10% +31.4% 782.00 ± 3% slabinfo.Acpi-State.num_objs 2831 ± 3% -14.0% 2434 ± 5% slabinfo.avtab_node.active_objs 2831 ± 3% -14.0% 2434 ± 5% slabinfo.avtab_node.num_objs 934.00 -10.9% 832.33 ± 5% slabinfo.inotify_inode_mark.active_objs 934.00 -10.9% 832.33 ± 5% slabinfo.inotify_inode_mark.num_objs 1232 ± 4% +13.4% 1397 ± 6% slabinfo.nsproxy.active_objs 1232 ± 4% +13.4% 1397 ± 6% slabinfo.nsproxy.num_objs 499.67 ± 12% +24.8% 623.67 ± 10% slabinfo.secpath_cache.active_objs 499.67 ± 12% +24.8% 623.67 ± 10% slabinfo.secpath_cache.num_objs 31393 ± 84% +220.1% 100477 ± 21% numa-meminfo.node0.Active 31393 ± 84% +220.1% 100477 ± 21% numa-meminfo.node0.Active(anon) 30013 ± 85% +232.1% 99661 ± 21% numa-meminfo.node0.AnonPages 21603 ± 34% -85.0% 3237 ±100% numa-meminfo.node0.Inactive 21528 ± 34% -85.0% 3237 ±100% numa-meminfo.node0.Inactive(anon) 10247 ± 35% -46.4% 5495 numa-meminfo.node0.Mapped 35388 ± 14% -41.6% 20670 ± 15% numa-meminfo.node0.SReclaimable 22911 ± 29% -82.3% 4057 ± 84% numa-meminfo.node0.Shmem 117387 ± 9% -22.5% 90986 ± 12% numa-meminfo.node0.Slab 68863 ± 67% +77.7% 122351 ± 13% numa-meminfo.node1.Active 68863 ± 67% +77.7% 122351 ± 13% numa-meminfo.node1.Active(anon) 228376 +22.3% 279406 ± 17% numa-meminfo.node1.FilePages 1481 ±116% +1062.1% 17218 ± 39% numa-meminfo.node1.Inactive 1481 ±116% +1062.0% 17216 ± 39% numa-meminfo.node1.Inactive(anon) 6593 ± 2% +11.7% 7367 ± 3% numa-meminfo.node1.KernelStack 596227 ± 8% +18.0% 703748 ± 4% numa-meminfo.node1.MemUsed 15298 ± 12% +88.5% 28843 ± 36% numa-meminfo.node1.SReclaimable 52718 ± 9% +21.0% 63810 ± 11% numa-meminfo.node1.SUnreclaim 1808 ± 97% +2723.8% 51054 ± 97% numa-meminfo.node1.Shmem 68017 ± 5% +36.2% 92654 ± 18% numa-meminfo.node1.Slab 125541 ± 29% -64.9% 44024 ± 98% numa-meminfo.node3.Active 125137 ± 29% -65.0% 43823 ± 98% numa-meminfo.node3.Active(anon) 93173 ± 25% -87.8% 11381 ± 20% numa-meminfo.node3.AnonPages 9150 ± 5% -9.3% 8301 ± 8% numa-meminfo.node3.KernelStack 7848 ± 84% +220.0% 25118 ± 21% numa-vmstat.node0.nr_active_anon 7503 ± 85% +232.1% 24914 ± 21% numa-vmstat.node0.nr_anon_pages 5381 ± 34% -85.0% 809.00 ±100% numa-vmstat.node0.nr_inactive_anon 2559 ± 35% -46.4% 1372 numa-vmstat.node0.nr_mapped 5727 ± 29% -82.3% 1014 ± 84% numa-vmstat.node0.nr_shmem 8846 ± 14% -41.6% 5167 ± 15% numa-vmstat.node0.nr_slab_reclaimable 7848 ± 84% +220.0% 25118 ± 21% numa-vmstat.node0.nr_zone_active_anon 5381 
± 34% -85.0% 809.00 ±100% numa-vmstat.node0.nr_zone_inactive_anon 4821 ± 2% +30.3% 6283 ± 15% numa-vmstat.node1 17215 ± 67% +77.7% 30591 ± 13% numa-vmstat.node1.nr_active_anon 57093 +22.3% 69850 ± 17% numa-vmstat.node1.nr_file_pages 370.00 ±116% +1061.8% 4298 ± 39% numa-vmstat.node1.nr_inactive_anon 6593 ± 2% +11.7% 7366 ± 3% numa-vmstat.node1.nr_kernel_stack 451.67 ± 97% +2725.6% 12762 ± 97% numa-vmstat.node1.nr_shmem 3824 ± 12% +88.6% 7211 ± 36% numa-vmstat.node1.nr_slab_reclaimable 13179 ± 9% +21.0% 15952 ± 11% numa-vmstat.node1.nr_slab_unreclaimable 17215 ± 67% +77.7% 30591 ± 13% numa-vmstat.node1.nr_zone_active_anon 370.00 ±116% +1061.8% 4298 ± 39% numa-vmstat.node1.nr_zone_inactive_anon 364789 ± 12% +62.8% 593926 ± 34% numa-vmstat.node1.numa_hit 239539 ± 19% +95.4% 468113 ± 43% numa-vmstat.node1.numa_local 71.00 ± 28% +42.3% 101.00 numa-vmstat.node2.nr_mlock 31285 ± 29% -65.0% 10960 ± 98% numa-vmstat.node3.nr_active_anon 23292 ± 25% -87.8% 2844 ± 19% numa-vmstat.node3.nr_anon_pages 14339 ± 52% +141.1% 34566 ± 32% numa-vmstat.node3.nr_free_cma 9151 ± 5% -9.3% 8299 ± 8% numa-vmstat.node3.nr_kernel_stack 31305 ± 29% -64.9% 10975 ± 98% numa-vmstat.node3.nr_zone_active_anon 930131 ± 3% -35.9% 596006 ± 34% numa-vmstat.node3.numa_hit 836455 ± 3% -40.9% 493947 ± 44% numa-vmstat.node3.numa_local 75182 ± 58% -83.8% 12160 ± 2% sched_debug.cfs_rq:/.load.max 6.65 ± 5% -10.6% 5.94 ± 6% sched_debug.cfs_rq:/.load_avg.avg 0.16 ± 7% +22.6% 0.20 ± 12% sched_debug.cfs_rq:/.nr_running.stddev 5.58 ± 24% +427.7% 29.42 ± 93% sched_debug.cfs_rq:/.nr_spread_over.max 0.54 ± 15% +306.8% 2.19 ± 86% sched_debug.cfs_rq:/.nr_spread_over.stddev 1.05 ± 25% -65.1% 0.37 ± 71% sched_debug.cfs_rq:/.removed.load_avg.avg 9.62 ± 11% -50.7% 4.74 ± 70% sched_debug.cfs_rq:/.removed.load_avg.stddev 48.70 ± 25% -65.1% 17.02 ± 71% sched_debug.cfs_rq:/.removed.runnable_sum.avg 444.31 ± 11% -50.7% 219.26 ± 70% sched_debug.cfs_rq:/.removed.runnable_sum.stddev 0.47 ± 13% -60.9% 0.19 ± 71% sched_debug.cfs_rq:/.removed.util_avg.avg 4.47 ± 4% -46.5% 2.39 ± 70% sched_debug.cfs_rq:/.removed.util_avg.stddev 1.64 ± 7% +22.1% 2.00 ± 13% sched_debug.cfs_rq:/.runnable_load_avg.stddev 74653 ± 59% -84.4% 11676 sched_debug.cfs_rq:/.runnable_weight.max -119169 -491.3% 466350 ± 27% sched_debug.cfs_rq:/.spread0.avg 517161 ± 30% +145.8% 1271292 ± 23% sched_debug.cfs_rq:/.spread0.max 624.79 ± 5% -14.2% 535.76 ± 7% sched_debug.cfs_rq:/.util_est_enqueued.avg 247.91 ± 32% -99.8% 0.48 ± 8% sched_debug.cfs_rq:/.util_est_enqueued.min 179704 ± 3% +30.4% 234297 ± 16% sched_debug.cpu.avg_idle.stddev 1.56 ± 9% +24.4% 1.94 ± 14% sched_debug.cpu.cpu_load[0].stddev 1.50 ± 6% +27.7% 1.91 ± 14% sched_debug.cpu.cpu_load[1].stddev 1.45 ± 3% +30.8% 1.90 ± 14% sched_debug.cpu.cpu_load[2].stddev 1.43 ± 3% +36.1% 1.95 ± 11% sched_debug.cpu.cpu_load[3].stddev 1.55 ± 7% +43.5% 2.22 ± 7% sched_debug.cpu.cpu_load[4].stddev 10004 ± 3% -11.6% 8839 ± 3% sched_debug.cpu.curr->pid.avg 1146 ± 26% +52.2% 1745 ± 7% sched_debug.cpu.curr->pid.min 3162 ± 6% +25.4% 3966 ± 11% sched_debug.cpu.curr->pid.stddev 403738 ± 3% -11.7% 356696 ± 7% sched_debug.cpu.nr_switches.max 0.08 ± 21% +78.2% 0.14 ± 14% sched_debug.cpu.nr_uninterruptible.avg 404435 ± 3% -11.8% 356732 ± 7% sched_debug.cpu.sched_count.max 4.17 -0.3 3.87 perf-profile.calltrace.cycles-pp.kmem_cache_alloc.do_brk_flags.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 2.40 -0.2 2.17 perf-profile.calltrace.cycles-pp.vma_compute_subtree_gap.__vma_link_rb.vma_link.do_brk_flags.__x64_sys_brk 7.58 -0.2 7.36 
perf-profile.calltrace.cycles-pp.perf_event_mmap.do_brk_flags.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 15.00 -0.2 14.81 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.brk 7.83 -0.2 7.66 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_munmap.__x64_sys_brk.do_syscall_64 28.66 -0.1 28.51 perf-profile.calltrace.cycles-pp.do_brk_flags.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk 2.15 -0.1 2.03 perf-profile.calltrace.cycles-pp.vma_compute_subtree_gap.do_munmap.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 1.07 -0.1 0.99 perf-profile.calltrace.cycles-pp.memcpy_erms.strlcpy.perf_event_mmap.do_brk_flags.__x64_sys_brk 1.03 -0.1 0.95 perf-profile.calltrace.cycles-pp.kmem_cache_free.remove_vma.do_munmap.__x64_sys_brk.do_syscall_64 7.33 -0.1 7.25 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_munmap.__x64_sys_brk 0.76 -0.1 0.69 perf-profile.calltrace.cycles-pp.__vm_enough_memory.do_brk_flags.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 11.85 -0.1 11.77 perf-profile.calltrace.cycles-pp.unmap_region.do_munmap.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 1.64 -0.1 1.57 perf-profile.calltrace.cycles-pp.strlcpy.perf_event_mmap.do_brk_flags.__x64_sys_brk.do_syscall_64 1.06 -0.1 0.99 perf-profile.calltrace.cycles-pp.__indirect_thunk_start.brk 0.73 -0.1 0.67 perf-profile.calltrace.cycles-pp.sync_mm_rss.unmap_page_range.unmap_vmas.unmap_region.do_munmap 4.59 -0.1 4.52 perf-profile.calltrace.cycles-pp.security_vm_enough_memory_mm.do_brk_flags.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 2.82 -0.1 2.76 perf-profile.calltrace.cycles-pp.selinux_vm_enough_memory.security_vm_enough_memory_mm.do_brk_flags.__x64_sys_brk.do_syscall_64 2.89 -0.1 2.84 perf-profile.calltrace.cycles-pp.down_write_killable.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk 3.37 -0.1 3.32 perf-profile.calltrace.cycles-pp.get_unmapped_area.do_brk_flags.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 1.99 -0.0 1.94 perf-profile.calltrace.cycles-pp.cred_has_capability.selinux_vm_enough_memory.security_vm_enough_memory_mm.do_brk_flags.__x64_sys_brk 2.32 -0.0 2.27 perf-profile.calltrace.cycles-pp.perf_iterate_sb.perf_event_mmap.do_brk_flags.__x64_sys_brk.do_syscall_64 1.88 -0.0 1.84 perf-profile.calltrace.cycles-pp.security_mmap_addr.get_unmapped_area.do_brk_flags.__x64_sys_brk.do_syscall_64 0.77 -0.0 0.73 perf-profile.calltrace.cycles-pp._raw_spin_lock.unmap_page_range.unmap_vmas.unmap_region.do_munmap 1.62 -0.0 1.59 perf-profile.calltrace.cycles-pp.memset_erms.kmem_cache_alloc.do_brk_flags.__x64_sys_brk.do_syscall_64 0.81 -0.0 0.79 perf-profile.calltrace.cycles-pp.___might_sleep.down_write_killable.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.66 -0.0 0.64 perf-profile.calltrace.cycles-pp.arch_get_unmapped_area_topdown.brk 0.72 +0.0 0.74 perf-profile.calltrace.cycles-pp.do_munmap.brk 0.90 +0.0 0.93 perf-profile.calltrace.cycles-pp.___might_sleep.unmap_page_range.unmap_vmas.unmap_region.do_munmap 4.40 +0.1 4.47 perf-profile.calltrace.cycles-pp.find_vma.do_munmap.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 1.96 +0.1 2.09 perf-profile.calltrace.cycles-pp.vmacache_find.find_vma.do_munmap.__x64_sys_brk.do_syscall_64 0.52 ± 2% +0.2 0.68 perf-profile.calltrace.cycles-pp.__vma_link_rb.brk 0.35 ± 70% +0.2 0.54 ± 2% perf-profile.calltrace.cycles-pp.find_vma.brk 2.20 +0.3 2.50 
perf-profile.calltrace.cycles-pp.remove_vma.do_munmap.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 64.62 +0.3 64.94 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.brk 60.53 +0.4 60.92 perf-profile.calltrace.cycles-pp.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk 63.20 +0.4 63.60 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk 3.73 +0.5 4.26 perf-profile.calltrace.cycles-pp.vma_link.do_brk_flags.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +0.6 0.56 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_munmap.__x64_sys_brk.do_syscall_64 24.54 +0.6 25.14 perf-profile.calltrace.cycles-pp.do_munmap.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk 0.00 +0.6 0.64 perf-profile.calltrace.cycles-pp.put_vma.remove_vma.do_munmap.__x64_sys_brk.do_syscall_64 0.71 +0.6 1.36 perf-profile.calltrace.cycles-pp.__vma_rb_erase.do_munmap.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +0.7 0.70 perf-profile.calltrace.cycles-pp._raw_write_lock.__vma_rb_erase.do_munmap.__x64_sys_brk.do_syscall_64 3.10 +0.7 3.82 perf-profile.calltrace.cycles-pp.__vma_link_rb.vma_link.do_brk_flags.__x64_sys_brk.do_syscall_64 0.00 +0.8 0.76 perf-profile.calltrace.cycles-pp._raw_write_lock.__vma_link_rb.vma_link.do_brk_flags.__x64_sys_brk 0.00 +0.8 0.85 perf-profile.calltrace.cycles-pp.__vma_merge.do_brk_flags.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 5.09 -0.5 4.62 perf-profile.children.cycles-pp.vma_compute_subtree_gap 4.54 -0.3 4.21 perf-profile.children.cycles-pp.kmem_cache_alloc 8.11 -0.2 7.89 perf-profile.children.cycles-pp.perf_event_mmap 8.05 -0.2 7.85 perf-profile.children.cycles-pp.unmap_vmas 15.01 -0.2 14.81 perf-profile.children.cycles-pp.syscall_return_via_sysret 29.20 -0.1 29.06 perf-profile.children.cycles-pp.do_brk_flags 1.11 -0.1 1.00 perf-profile.children.cycles-pp.kmem_cache_free 12.28 -0.1 12.17 perf-profile.children.cycles-pp.unmap_region 7.83 -0.1 7.74 perf-profile.children.cycles-pp.unmap_page_range 0.87 ± 3% -0.1 0.79 perf-profile.children.cycles-pp.__vm_enough_memory 1.29 -0.1 1.22 perf-profile.children.cycles-pp.__indirect_thunk_start 1.81 -0.1 1.74 perf-profile.children.cycles-pp.strlcpy 4.65 -0.1 4.58 perf-profile.children.cycles-pp.security_vm_enough_memory_mm 3.08 -0.1 3.02 perf-profile.children.cycles-pp.down_write_killable 2.88 -0.1 2.82 perf-profile.children.cycles-pp.selinux_vm_enough_memory 0.73 -0.1 0.67 perf-profile.children.cycles-pp.sync_mm_rss 3.65 -0.1 3.59 perf-profile.children.cycles-pp.get_unmapped_area 2.26 -0.1 2.20 perf-profile.children.cycles-pp.cred_has_capability 1.12 -0.1 1.07 perf-profile.children.cycles-pp.memcpy_erms 0.39 -0.0 0.35 perf-profile.children.cycles-pp.__rb_insert_augmented 2.52 -0.0 2.48 perf-profile.children.cycles-pp.perf_iterate_sb 2.13 -0.0 2.09 perf-profile.children.cycles-pp.security_mmap_addr 0.55 ± 2% -0.0 0.52 perf-profile.children.cycles-pp.unmap_single_vma 1.62 -0.0 1.59 perf-profile.children.cycles-pp.memset_erms 0.13 ± 3% -0.0 0.11 ± 4% perf-profile.children.cycles-pp.__vma_link_file 0.80 -0.0 0.77 perf-profile.children.cycles-pp._raw_spin_lock 0.43 -0.0 0.41 perf-profile.children.cycles-pp.strlen 0.07 ± 6% -0.0 0.06 ± 8% perf-profile.children.cycles-pp.should_failslab 0.43 -0.0 0.42 perf-profile.children.cycles-pp.may_expand_vm 0.15 +0.0 0.16 perf-profile.children.cycles-pp.__vma_link_list 0.45 +0.0 0.47 perf-profile.children.cycles-pp.rcu_all_qs 0.81 +0.1 0.89 
perf-profile.children.cycles-pp.free_pgtables 6.35 +0.1 6.49 perf-profile.children.cycles-pp.find_vma 2.28 +0.2 2.45 perf-profile.children.cycles-pp.vmacache_find 64.66 +0.3 64.98 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe 2.42 +0.3 2.76 perf-profile.children.cycles-pp.remove_vma 61.77 +0.4 62.13 perf-profile.children.cycles-pp.__x64_sys_brk 63.40 +0.4 63.79 perf-profile.children.cycles-pp.do_syscall_64 1.27 +0.4 1.72 perf-profile.children.cycles-pp.__vma_rb_erase 4.02 +0.5 4.53 perf-profile.children.cycles-pp.vma_link 25.26 +0.6 25.89 perf-profile.children.cycles-pp.do_munmap 0.00 +0.7 0.70 perf-profile.children.cycles-pp.put_vma 3.80 +0.7 4.53 perf-profile.children.cycles-pp.__vma_link_rb 0.00 +1.2 1.24 perf-profile.children.cycles-pp.__vma_merge 0.00 +1.5 1.51 perf-profile.children.cycles-pp._raw_write_lock 5.07 -0.5 4.60 perf-profile.self.cycles-pp.vma_compute_subtree_gap 0.59 -0.2 0.38 perf-profile.self.cycles-pp.remove_vma 15.01 -0.2 14.81 perf-profile.self.cycles-pp.syscall_return_via_sysret 3.15 -0.2 2.96 perf-profile.self.cycles-pp.do_munmap 0.98 -0.1 0.87 perf-profile.self.cycles-pp.__vma_rb_erase 1.10 -0.1 0.99 perf-profile.self.cycles-pp.kmem_cache_free 0.68 -0.1 0.58 perf-profile.self.cycles-pp.__vm_enough_memory 0.42 -0.1 0.33 perf-profile.self.cycles-pp.unmap_vmas 3.62 -0.1 3.53 perf-profile.self.cycles-pp.perf_event_mmap 1.41 -0.1 1.34 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe 1.29 -0.1 1.22 perf-profile.self.cycles-pp.__indirect_thunk_start 0.73 -0.1 0.66 perf-profile.self.cycles-pp.sync_mm_rss 2.96 -0.1 2.90 perf-profile.self.cycles-pp.__x64_sys_brk 3.24 -0.1 3.19 perf-profile.self.cycles-pp.brk 1.11 -0.0 1.07 perf-profile.self.cycles-pp.memcpy_erms 0.53 ± 3% -0.0 0.49 ± 2% perf-profile.self.cycles-pp.vma_link 0.73 -0.0 0.69 perf-profile.self.cycles-pp.unmap_region 1.66 -0.0 1.61 perf-profile.self.cycles-pp.down_write_killable 0.39 -0.0 0.35 perf-profile.self.cycles-pp.__rb_insert_augmented 1.74 -0.0 1.71 perf-profile.self.cycles-pp.kmem_cache_alloc 0.55 ± 2% -0.0 0.52 perf-profile.self.cycles-pp.unmap_single_vma 1.61 -0.0 1.59 perf-profile.self.cycles-pp.memset_erms 0.80 -0.0 0.77 perf-profile.self.cycles-pp._raw_spin_lock 0.13 -0.0 0.11 ± 4% perf-profile.self.cycles-pp.__vma_link_file 0.43 -0.0 0.41 perf-profile.self.cycles-pp.strlen 0.07 ± 6% -0.0 0.06 ± 8% perf-profile.self.cycles-pp.should_failslab 0.81 -0.0 0.79 perf-profile.self.cycles-pp.tlb_finish_mmu 0.15 +0.0 0.16 perf-profile.self.cycles-pp.__vma_link_list 0.45 +0.0 0.47 perf-profile.self.cycles-pp.rcu_all_qs 0.71 +0.0 0.72 perf-profile.self.cycles-pp.strlcpy 0.51 +0.1 0.56 perf-profile.self.cycles-pp.free_pgtables 1.41 +0.1 1.48 perf-profile.self.cycles-pp.__vma_link_rb 2.27 +0.2 2.44 perf-profile.self.cycles-pp.vmacache_find 0.00 +0.7 0.69 perf-profile.self.cycles-pp.put_vma 0.00 +1.2 1.23 perf-profile.self.cycles-pp.__vma_merge 0.00 +1.5 1.50 perf-profile.self.cycles-pp._raw_write_lock ========================================================================================= tbox_group/testcase/rootfs/kconfig/compiler/nr_task/thp_enabled/test/cpufreq_governor: lkp-skl-4sp1/will-it-scale/debian-x86_64-2018-04-03.cgz/x86_64-rhel-7.2/gcc-7/100%/always/brk1/performance commit: ba98a1cdad71d259a194461b3a61471b49b14df1 a7a8993bfe3ccb54ad468b9f1799649e4ad1ff12 ba98a1cdad71d259 a7a8993bfe3ccb54ad468b9f17 ---------------- -------------------------- fail:runs %reproduction fail:runs | | | :3 33% 1:3 dmesg.WARNING:stack_going_in_the_wrong_direction?ip=schedule_tail/0x :3 33% 1:3 
kmsg.DHCP/BOOTP:Reply_not_for_us_on_eth#,op[#]xid[#] %stddev %change %stddev \ | \ 998475 -2.2% 976893 will-it-scale.per_process_ops 625.87 -2.3% 611.42 will-it-scale.time.elapsed_time 625.87 -2.3% 611.42 will-it-scale.time.elapsed_time.max 8158 -1.9% 8000 will-it-scale.time.maximum_resident_set_size 18.42 ± 2% -11.9% 16.24 will-it-scale.time.user_time 34349225 ± 13% -14.5% 29371024 ± 17% will-it-scale.time.voluntary_context_switches 1.919e+08 -2.2% 1.877e+08 will-it-scale.workload 1639 ± 23% -18.4% 1337 ± 30% meminfo.Mlocked 17748 ± 82% +103.1% 36051 numa-numastat.node3.other_node 33410486 ± 14% -14.8% 28449258 ± 18% cpuidle.C1.usage 698749 ± 15% -18.0% 573307 ± 20% cpuidle.POLL.usage 3013702 ± 14% -15.1% 2559405 ± 17% softirqs.SCHED 54361293 ± 2% -19.0% 44044816 ± 2% softirqs.TIMER 33408303 ± 14% -14.9% 28447123 ± 18% turbostat.C1 0.34 ± 16% -52.0% 0.16 ± 15% turbostat.Pkg%pc2 1310 ± 74% +412.1% 6710 ± 58% irq_exception_noise.__do_page_fault.samples 3209 ± 74% +281.9% 12258 ± 53% irq_exception_noise.__do_page_fault.sum 600.67 ±132% -96.0% 24.00 ± 23% irq_exception_noise.irq_nr 99557 ± 7% -24.0% 75627 ± 7% irq_exception_noise.softirq_nr 41424 ± 9% -24.6% 31253 ± 6% irq_exception_noise.softirq_time 625.87 -2.3% 611.42 time.elapsed_time 625.87 -2.3% 611.42 time.elapsed_time.max 8158 -1.9% 8000 time.maximum_resident_set_size 18.42 ± 2% -11.9% 16.24 time.user_time 34349225 ± 13% -14.5% 29371024 ± 17% time.voluntary_context_switches 988.00 ± 8% +14.5% 1131 ± 2% slabinfo.Acpi-ParseExt.active_objs 988.00 ± 8% +14.5% 1131 ± 2% slabinfo.Acpi-ParseExt.num_objs 2384 ± 3% +21.1% 2888 ± 11% slabinfo.pool_workqueue.active_objs 2474 ± 2% +20.4% 2979 ± 11% slabinfo.pool_workqueue.num_objs 490.33 ± 10% -19.2% 396.00 ± 11% slabinfo.secpath_cache.active_objs 490.33 ± 10% -19.2% 396.00 ± 11% slabinfo.secpath_cache.num_objs 1123 ± 7% +14.2% 1282 ± 3% slabinfo.skbuff_fclone_cache.active_objs 1123 ± 7% +14.2% 1282 ± 3% slabinfo.skbuff_fclone_cache.num_objs 1.09 -0.0 1.07 perf-stat.branch-miss-rate% 2.691e+11 -2.4% 2.628e+11 perf-stat.branch-misses 71981351 ± 12% -13.8% 62013509 ± 16% perf-stat.context-switches 1.697e+13 +1.1% 1.715e+13 perf-stat.dTLB-stores 2.36 ± 29% +4.4 6.76 ± 11% perf-stat.iTLB-load-miss-rate% 5.21e+08 ± 28% +194.8% 1.536e+09 ± 10% perf-stat.iTLB-load-misses 239983 ± 24% -68.4% 75819 ± 11% perf-stat.instructions-per-iTLB-miss 3295653 ± 2% -6.3% 3088753 ± 3% perf-stat.node-stores 606239 +1.1% 612799 perf-stat.path-length 3755 ± 28% -37.5% 2346 ± 52% sched_debug.cfs_rq:/.exec_clock.stddev 10.45 ± 4% +24.3% 12.98 ± 18% sched_debug.cfs_rq:/.load_avg.stddev 6243 ± 46% -38.6% 3831 ± 78% sched_debug.cpu.load.stddev 867.80 ± 7% +25.3% 1087 ± 6% sched_debug.cpu.nr_load_updates.stddev 395898 ± 3% -11.1% 352071 ± 7% sched_debug.cpu.nr_switches.max -13.33 -21.1% -10.52 sched_debug.cpu.nr_uninterruptible.min 395674 ± 3% -11.1% 351762 ± 7% sched_debug.cpu.sched_count.max 33152 ± 4% -12.8% 28899 sched_debug.cpu.ttwu_count.min 0.03 ± 20% +77.7% 0.05 ± 15% sched_debug.rt_rq:/.rt_time.max 89523 +1.8% 91099 proc-vmstat.nr_active_anon 409.67 ± 23% -18.4% 334.33 ± 30% proc-vmstat.nr_mlock 89530 +1.8% 91117 proc-vmstat.nr_zone_active_anon 2337130 -2.2% 2286775 proc-vmstat.numa_hit 2229090 -2.3% 2178626 proc-vmstat.numa_local 8460 ± 39% -75.5% 2076 ± 53% proc-vmstat.numa_pages_migrated 28643 ± 55% -83.5% 4727 ± 58% proc-vmstat.numa_pte_updates 2695806 -1.8% 2646639 proc-vmstat.pgfault 2330191 -2.1% 2281197 proc-vmstat.pgfree 8460 ± 39% -75.5% 2076 ± 53% proc-vmstat.pgmigrate_success 237651 ± 2% +31.3% 312092 ± 
16% numa-meminfo.node0.FilePages 8059 ± 2% +10.7% 8925 ± 7% numa-meminfo.node0.KernelStack 6830 ± 25% +48.8% 10164 ± 35% numa-meminfo.node0.Mapped 1612 ± 21% +70.0% 2740 ± 19% numa-meminfo.node0.PageTables 10772 ± 65% +679.4% 83962 ± 59% numa-meminfo.node0.Shmem 163195 ± 15% -36.9% 103036 ± 32% numa-meminfo.node1.Active 163195 ± 15% -36.9% 103036 ± 32% numa-meminfo.node1.Active(anon) 1730 ± 4% +33.9% 2317 ± 14% numa-meminfo.node1.PageTables 55778 ± 19% +32.5% 73910 ± 8% numa-meminfo.node1.SUnreclaim 2671 ± 16% -45.0% 1469 ± 15% numa-meminfo.node2.PageTables 61537 ± 13% -17.7% 50647 ± 3% numa-meminfo.node2.SUnreclaim 48644 ± 94% +149.8% 121499 ± 11% numa-meminfo.node3.Active 48440 ± 94% +150.4% 121295 ± 11% numa-meminfo.node3.Active(anon) 11832 ± 79% -91.5% 1008 ± 67% numa-meminfo.node3.Inactive 11597 ± 82% -93.3% 772.00 ± 82% numa-meminfo.node3.Inactive(anon) 10389 ± 32% -43.0% 5921 ± 6% numa-meminfo.node3.Mapped 33704 ± 24% -44.2% 18792 ± 15% numa-meminfo.node3.SReclaimable 104733 ± 14% -25.3% 78275 ± 8% numa-meminfo.node3.Slab 139329 ±133% -99.8% 241.67 ± 79% latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64 5403 ±139% -97.5% 137.67 ± 71% latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 165968 ±101% -61.9% 63304 ± 58% latency_stats.avg.max 83.00 +12810.4% 10715 ±140% latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_access.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat.filename_lookup 102.67 ± 6% +18845.5% 19450 ±140% latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat 136.33 ± 16% +25043.5% 34279 ±141% latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.__lookup_slow.lookup_slow.walk_component.path_lookupat.filename_lookup 18497 ±141% -100.0% 0.00 latency_stats.max.call_rwsem_down_write_failed_killable.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe 140500 ±131% -99.8% 247.00 ± 78% latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64 5403 ±139% -97.5% 137.67 ± 71% latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 87.33 ± 5% +23963.0% 21015 ±140% latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_access.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat.filename_lookup 136.33 ± 16% +25043.5% 34279 ±141% latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.__lookup_slow.lookup_slow.walk_component.path_lookupat.filename_lookup 149.33 ± 14% +25485.9% 38208 ±141% 
latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat 18761 ±141% -100.0% 0.00 latency_stats.sum.call_rwsem_down_write_failed_killable.vm_mmap_pgoff.ksys_mmap_pgoff.do_syscall_64.entry_SYSCALL_64_after_hwframe 23363 ±114% -100.0% 0.00 latency_stats.sum.call_rwsem_down_read_failed.__do_page_fault.do_page_fault.page_fault.__get_user_8.exit_robust_list.mm_release.do_exit.do_group_exit.get_signal.do_signal.exit_to_usermode_loop 144810 ±125% -99.8% 326.67 ± 70% latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_do_create.nfs3_proc_create.nfs_create.path_openat.do_filp_open.do_sys_open.do_syscall_64 5403 ±139% -97.5% 137.67 ± 71% latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 59698 ± 98% -78.0% 13110 ±141% latency_stats.sum.call_rwsem_down_read_failed.do_exit.do_group_exit.get_signal.do_signal.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe 166.33 +12768.5% 21404 ±140% latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_access.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat.filename_lookup 825.00 ± 6% +18761.7% 155609 ±140% latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat 136.33 ± 16% +25043.5% 34279 ±141% latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup.__lookup_slow.lookup_slow.walk_component.path_lookupat.filename_lookup 59412 ± 2% +31.3% 78021 ± 16% numa-vmstat.node0.nr_file_pages 8059 ± 2% +10.7% 8923 ± 7% numa-vmstat.node0.nr_kernel_stack 1701 ± 25% +49.1% 2536 ± 35% numa-vmstat.node0.nr_mapped 402.33 ± 21% +70.0% 684.00 ± 19% numa-vmstat.node0.nr_page_table_pages 2692 ± 65% +679.5% 20988 ± 59% numa-vmstat.node0.nr_shmem 622587 ± 36% +37.7% 857545 ± 13% numa-vmstat.node0.numa_local 40797 ± 15% -36.9% 25757 ± 32% numa-vmstat.node1.nr_active_anon 432.00 ± 4% +33.9% 578.33 ± 14% numa-vmstat.node1.nr_page_table_pages 13944 ± 19% +32.5% 18477 ± 8% numa-vmstat.node1.nr_slab_unreclaimable 40797 ± 15% -36.9% 25757 ± 32% numa-vmstat.node1.nr_zone_active_anon 625073 ± 26% +29.4% 808657 ± 18% numa-vmstat.node1.numa_hit 503969 ± 34% +39.2% 701446 ± 23% numa-vmstat.node1.numa_local 137.33 ± 40% -49.0% 70.00 ± 29% numa-vmstat.node2.nr_mlock 667.67 ± 17% -45.1% 366.33 ± 15% numa-vmstat.node2.nr_page_table_pages 15384 ± 13% -17.7% 12662 ± 3% numa-vmstat.node2.nr_slab_unreclaimable 12114 ± 94% +150.3% 30326 ± 11% numa-vmstat.node3.nr_active_anon 2887 ± 83% -93.4% 190.00 ± 82% numa-vmstat.node3.nr_inactive_anon 2632 ± 30% -39.2% 1600 ± 5% numa-vmstat.node3.nr_mapped 101.00 -30.0% 70.67 ± 29% numa-vmstat.node3.nr_mlock 8425 ± 24% -44.2% 4697 ± 15% numa-vmstat.node3.nr_slab_reclaimable 12122 ± 94% +150.3% 30346 ± 11% numa-vmstat.node3.nr_zone_active_anon 2887 ± 83% -93.4% 190.00 ± 82% numa-vmstat.node3.nr_zone_inactive_anon 106945 ± 13% +17.4% 125554 numa-vmstat.node3.numa_other 4.17 -0.3 3.82 
perf-profile.calltrace.cycles-pp.kmem_cache_alloc.do_brk_flags.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 15.02 -0.3 14.77 perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.brk 2.42 -0.2 2.18 perf-profile.calltrace.cycles-pp.vma_compute_subtree_gap.__vma_link_rb.vma_link.do_brk_flags.__x64_sys_brk 7.60 -0.2 7.39 perf-profile.calltrace.cycles-pp.perf_event_mmap.do_brk_flags.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 7.79 -0.2 7.63 perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.do_munmap.__x64_sys_brk.do_syscall_64 0.82 ± 9% -0.1 0.68 perf-profile.calltrace.cycles-pp.__vm_enough_memory.do_brk_flags.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 2.13 -0.1 2.00 perf-profile.calltrace.cycles-pp.vma_compute_subtree_gap.do_munmap.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 1.05 -0.1 0.95 perf-profile.calltrace.cycles-pp.kmem_cache_free.remove_vma.do_munmap.__x64_sys_brk.do_syscall_64 7.31 -0.1 7.21 perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.do_munmap.__x64_sys_brk 0.74 -0.1 0.67 perf-profile.calltrace.cycles-pp.sync_mm_rss.unmap_page_range.unmap_vmas.unmap_region.do_munmap 1.06 -0.1 1.00 perf-profile.calltrace.cycles-pp.memcpy_erms.strlcpy.perf_event_mmap.do_brk_flags.__x64_sys_brk 3.38 -0.1 3.33 perf-profile.calltrace.cycles-pp.get_unmapped_area.do_brk_flags.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 1.05 -0.0 1.00 ± 2% perf-profile.calltrace.cycles-pp.__indirect_thunk_start.brk 2.34 -0.0 2.29 perf-profile.calltrace.cycles-pp.perf_iterate_sb.perf_event_mmap.do_brk_flags.__x64_sys_brk.do_syscall_64 1.64 -0.0 1.59 perf-profile.calltrace.cycles-pp.strlcpy.perf_event_mmap.do_brk_flags.__x64_sys_brk.do_syscall_64 1.89 -0.0 1.86 perf-profile.calltrace.cycles-pp.security_mmap_addr.get_unmapped_area.do_brk_flags.__x64_sys_brk.do_syscall_64 0.76 -0.0 0.73 perf-profile.calltrace.cycles-pp._raw_spin_lock.unmap_page_range.unmap_vmas.unmap_region.do_munmap 0.57 ± 2% -0.0 0.55 perf-profile.calltrace.cycles-pp.selinux_mmap_addr.security_mmap_addr.get_unmapped_area.do_brk_flags.__x64_sys_brk 0.54 ± 2% +0.0 0.56 perf-profile.calltrace.cycles-pp.do_brk_flags.brk 0.72 +0.0 0.76 ± 2% perf-profile.calltrace.cycles-pp.do_munmap.brk 4.38 +0.1 4.43 perf-profile.calltrace.cycles-pp.find_vma.do_munmap.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 1.96 +0.1 2.04 perf-profile.calltrace.cycles-pp.vmacache_find.find_vma.do_munmap.__x64_sys_brk.do_syscall_64 0.53 +0.2 0.68 perf-profile.calltrace.cycles-pp.__vma_link_rb.brk 2.21 +0.3 2.51 perf-profile.calltrace.cycles-pp.remove_vma.do_munmap.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 64.44 +0.5 64.90 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.brk 63.04 +0.5 63.54 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk 60.37 +0.5 60.88 perf-profile.calltrace.cycles-pp.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk 3.75 +0.5 4.29 perf-profile.calltrace.cycles-pp.vma_link.do_brk_flags.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +0.6 0.57 perf-profile.calltrace.cycles-pp.free_pgtables.unmap_region.do_munmap.__x64_sys_brk.do_syscall_64 0.00 +0.6 0.64 perf-profile.calltrace.cycles-pp.put_vma.remove_vma.do_munmap.__x64_sys_brk.do_syscall_64 0.72 +0.7 1.37 perf-profile.calltrace.cycles-pp.__vma_rb_erase.do_munmap.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 24.42 +0.7 25.08 
perf-profile.calltrace.cycles-pp.do_munmap.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk 0.00 +0.7 0.71 perf-profile.calltrace.cycles-pp._raw_write_lock.__vma_rb_erase.do_munmap.__x64_sys_brk.do_syscall_64 3.12 +0.7 3.84 perf-profile.calltrace.cycles-pp.__vma_link_rb.vma_link.do_brk_flags.__x64_sys_brk.do_syscall_64 0.00 +0.8 0.77 perf-profile.calltrace.cycles-pp._raw_write_lock.__vma_link_rb.vma_link.do_brk_flags.__x64_sys_brk 0.00 +0.9 0.85 perf-profile.calltrace.cycles-pp.__vma_merge.do_brk_flags.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe 5.10 -0.5 4.60 perf-profile.children.cycles-pp.vma_compute_subtree_gap 4.53 -0.3 4.18 perf-profile.children.cycles-pp.kmem_cache_alloc 15.03 -0.3 14.77 perf-profile.children.cycles-pp.syscall_return_via_sysret 8.13 -0.2 7.92 perf-profile.children.cycles-pp.perf_event_mmap 8.01 -0.2 7.81 perf-profile.children.cycles-pp.unmap_vmas 0.97 ± 14% -0.2 0.78 perf-profile.children.cycles-pp.__vm_enough_memory 1.13 -0.1 1.00 perf-profile.children.cycles-pp.kmem_cache_free 7.82 -0.1 7.70 perf-profile.children.cycles-pp.unmap_page_range 12.23 -0.1 12.13 perf-profile.children.cycles-pp.unmap_region 0.74 -0.1 0.67 perf-profile.children.cycles-pp.sync_mm_rss 3.06 -0.1 3.00 perf-profile.children.cycles-pp.down_write_killable 0.40 ± 2% -0.1 0.34 perf-profile.children.cycles-pp.__rb_insert_augmented 1.29 -0.1 1.23 perf-profile.children.cycles-pp.__indirect_thunk_start 2.54 -0.1 2.49 perf-profile.children.cycles-pp.perf_iterate_sb 3.66 -0.0 3.61 perf-profile.children.cycles-pp.get_unmapped_area 1.80 -0.0 1.75 perf-profile.children.cycles-pp.strlcpy 0.53 ± 2% -0.0 0.49 ± 2% perf-profile.children.cycles-pp.cap_capable 1.57 -0.0 1.53 perf-profile.children.cycles-pp.arch_get_unmapped_area_topdown 1.11 -0.0 1.08 perf-profile.children.cycles-pp.memcpy_erms 0.13 -0.0 0.10 perf-profile.children.cycles-pp.__vma_link_file 0.55 -0.0 0.52 perf-profile.children.cycles-pp.unmap_single_vma 1.47 -0.0 1.44 perf-profile.children.cycles-pp.cap_vm_enough_memory 2.14 -0.0 2.12 perf-profile.children.cycles-pp.security_mmap_addr 0.32 -0.0 0.30 perf-profile.children.cycles-pp.userfaultfd_unmap_complete 1.25 -0.0 1.23 perf-profile.children.cycles-pp.up_write 0.50 -0.0 0.49 perf-profile.children.cycles-pp.userfaultfd_unmap_prep 0.27 -0.0 0.26 perf-profile.children.cycles-pp.tlb_flush_mmu_free 1.14 -0.0 1.12 perf-profile.children.cycles-pp.__might_sleep 0.07 -0.0 0.06 perf-profile.children.cycles-pp.should_failslab 0.72 +0.0 0.74 perf-profile.children.cycles-pp._cond_resched 0.45 +0.0 0.47 perf-profile.children.cycles-pp.rcu_all_qs 0.15 ± 3% +0.0 0.17 ± 4% perf-profile.children.cycles-pp.__vma_link_list 0.15 ± 5% +0.0 0.18 ± 5% perf-profile.children.cycles-pp.tick_sched_timer 0.05 ± 8% +0.1 0.12 ± 17% perf-profile.children.cycles-pp.perf_mux_hrtimer_handler 0.80 +0.1 0.89 perf-profile.children.cycles-pp.free_pgtables 0.22 ± 7% +0.1 0.31 ± 9% perf-profile.children.cycles-pp.__hrtimer_run_queues 0.00 +0.1 0.11 ± 15% perf-profile.children.cycles-pp.clockevents_program_event 6.34 +0.1 6.47 perf-profile.children.cycles-pp.find_vma 2.27 +0.1 2.40 perf-profile.children.cycles-pp.vmacache_find 0.40 ± 4% +0.2 0.58 ± 5% perf-profile.children.cycles-pp.apic_timer_interrupt 0.40 ± 4% +0.2 0.58 ± 5% perf-profile.children.cycles-pp.smp_apic_timer_interrupt 0.37 ± 4% +0.2 0.54 ± 5% perf-profile.children.cycles-pp.hrtimer_interrupt 0.00 +0.2 0.19 ± 12% perf-profile.children.cycles-pp.ktime_get 2.42 +0.3 2.77 perf-profile.children.cycles-pp.remove_vma 64.49 +0.5 64.94 
perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe 1.27 +0.5 1.73 perf-profile.children.cycles-pp.__vma_rb_erase 61.62 +0.5 62.10 perf-profile.children.cycles-pp.__x64_sys_brk 63.24 +0.5 63.74 perf-profile.children.cycles-pp.do_syscall_64 4.03 +0.5 4.56 perf-profile.children.cycles-pp.vma_link 0.00 +0.7 0.69 perf-profile.children.cycles-pp.put_vma 25.13 +0.7 25.84 perf-profile.children.cycles-pp.do_munmap 3.83 +0.7 4.56 perf-profile.children.cycles-pp.__vma_link_rb 0.00 +1.2 1.25 perf-profile.children.cycles-pp.__vma_merge 0.00 +1.5 1.53 perf-profile.children.cycles-pp._raw_write_lock 5.08 -0.5 4.58 perf-profile.self.cycles-pp.vma_compute_subtree_gap 15.03 -0.3 14.77 perf-profile.self.cycles-pp.syscall_return_via_sysret 0.59 -0.2 0.39 perf-profile.self.cycles-pp.remove_vma 0.72 ± 7% -0.1 0.58 perf-profile.self.cycles-pp.__vm_enough_memory 1.12 -0.1 0.99 perf-profile.self.cycles-pp.kmem_cache_free 3.11 -0.1 2.99 perf-profile.self.cycles-pp.do_munmap 0.99 -0.1 0.88 perf-profile.self.cycles-pp.__vma_rb_erase 3.63 -0.1 3.52 perf-profile.self.cycles-pp.perf_event_mmap 3.26 -0.1 3.17 perf-profile.self.cycles-pp.brk 0.41 ± 2% -0.1 0.33 perf-profile.self.cycles-pp.unmap_vmas 0.74 -0.1 0.67 perf-profile.self.cycles-pp.sync_mm_rss 1.75 -0.1 1.68 perf-profile.self.cycles-pp.kmem_cache_alloc 0.40 ± 2% -0.1 0.34 perf-profile.self.cycles-pp.__rb_insert_augmented 1.29 ± 2% -0.1 1.23 perf-profile.self.cycles-pp.__indirect_thunk_start 0.73 -0.0 0.68 ± 2% perf-profile.self.cycles-pp.unmap_region 0.53 -0.0 0.49 perf-profile.self.cycles-pp.vma_link 1.40 -0.0 1.35 perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe 5.22 -0.0 5.18 perf-profile.self.cycles-pp.unmap_page_range 0.53 ± 2% -0.0 0.49 ± 2% perf-profile.self.cycles-pp.cap_capable 1.11 -0.0 1.07 perf-profile.self.cycles-pp.memcpy_erms 1.86 -0.0 1.82 perf-profile.self.cycles-pp.perf_iterate_sb 1.30 -0.0 1.27 perf-profile.self.cycles-pp.arch_get_unmapped_area_topdown 0.13 -0.0 0.10 perf-profile.self.cycles-pp.__vma_link_file 0.55 -0.0 0.52 perf-profile.self.cycles-pp.unmap_single_vma 0.74 -0.0 0.72 perf-profile.self.cycles-pp.selinux_mmap_addr 0.32 -0.0 0.30 perf-profile.self.cycles-pp.userfaultfd_unmap_complete 1.13 -0.0 1.12 perf-profile.self.cycles-pp.__might_sleep 1.24 -0.0 1.23 perf-profile.self.cycles-pp.up_write 0.50 -0.0 0.49 perf-profile.self.cycles-pp.userfaultfd_unmap_prep 0.27 -0.0 0.26 perf-profile.self.cycles-pp.tlb_flush_mmu_free 0.07 -0.0 0.06 perf-profile.self.cycles-pp.should_failslab 0.45 +0.0 0.47 perf-profile.self.cycles-pp.rcu_all_qs 0.71 +0.0 0.73 perf-profile.self.cycles-pp.strlcpy 0.15 ± 3% +0.0 0.17 ± 4% perf-profile.self.cycles-pp.__vma_link_list 0.51 +0.1 0.57 perf-profile.self.cycles-pp.free_pgtables 1.40 +0.1 1.49 perf-profile.self.cycles-pp.__vma_link_rb 2.27 +0.1 2.39 perf-profile.self.cycles-pp.vmacache_find 0.00 +0.2 0.18 ± 12% perf-profile.self.cycles-pp.ktime_get 0.00 +0.7 0.69 perf-profile.self.cycles-pp.put_vma 0.00 +1.2 1.24 perf-profile.self.cycles-pp.__vma_merge 0.00 +1.5 1.52 perf-profile.self.cycles-pp._raw_write_lock ========================================================================================= tbox_group/testcase/rootfs/kconfig/compiler/nr_task/thp_enabled/test/cpufreq_governor: lkp-skl-4sp1/will-it-scale/debian-x86_64-2018-04-03.cgz/x86_64-rhel-7.2/gcc-7/100%/always/page_fault2/performance commit: ba98a1cdad71d259a194461b3a61471b49b14df1 a7a8993bfe3ccb54ad468b9f1799649e4ad1ff12 ba98a1cdad71d259 a7a8993bfe3ccb54ad468b9f17 ---------------- -------------------------- fail:runs 
%reproduction fail:runs | | | :3 33% 1:3 dmesg.WARNING:at#for_ip_native_iret/0x 1:3 -33% :3 dmesg.WARNING:stack_going_in_the_wrong_direction?ip=__schedule/0x :3 33% 1:3 dmesg.WARNING:stack_going_in_the_wrong_direction?ip=__slab_free/0x 1:3 -33% :3 kmsg.DHCP/BOOTP:Reply_not_for_us_on_eth#,op[#]xid[#] 3:3 -100% :3 kmsg.pstore:crypto_comp_decompress_failed,ret= 3:3 -100% :3 kmsg.pstore:decompression_failed 2:3 4% 2:3 perf-profile.calltrace.cycles-pp.sync_regs.error_entry 5:3 7% 5:3 perf-profile.calltrace.cycles-pp.error_entry 5:3 7% 5:3 perf-profile.children.cycles-pp.error_entry 2:3 3% 2:3 perf-profile.self.cycles-pp.error_entry %stddev %change %stddev \ | \ 8281 ± 2% -18.8% 6728 will-it-scale.per_thread_ops 92778 ± 2% +17.6% 109080 will-it-scale.time.involuntary_context_switches 21954366 ± 3% +4.1% 22857988 ± 2% will-it-scale.time.maximum_resident_set_size 4.81e+08 ± 2% -18.9% 3.899e+08 will-it-scale.time.minor_page_faults 5804 +12.2% 6512 will-it-scale.time.percent_of_cpu_this_job_got 34918 +12.2% 39193 will-it-scale.time.system_time 5638528 ± 2% -15.3% 4778392 will-it-scale.time.voluntary_context_switches 15846405 -2.0% 15531034 will-it-scale.workload 2818137 +1.5% 2861500 interrupts.CAL:Function_call_interrupts 3.33 ± 28% -60.0% 1.33 ± 93% irq_exception_noise.irq_time 2866 +23.9% 3552 ± 2% kthread_noise.total_time 5589674 ± 14% +31.4% 7344810 ± 6% meminfo.DirectMap2M 31169 -16.9% 25906 uptime.idle 25242 ± 4% -14.2% 21654 ± 6% vmstat.system.cs 7055 -11.6% 6237 boot-time.idle 21.12 +19.3% 25.19 ± 9% boot-time.kernel_boot 20.03 ± 2% -3.7 16.38 mpstat.cpu.idle% 0.00 ± 8% -0.0 0.00 ± 4% mpstat.cpu.iowait% 7284147 ± 2% -16.4% 6092495 softirqs.RCU 5350756 ± 2% -10.9% 4769417 ± 4% softirqs.SCHED 42933 ± 21% -28.2% 30807 ± 7% numa-meminfo.node2.SReclaimable 63219 ± 13% -16.6% 52717 ± 6% numa-meminfo.node2.SUnreclaim 106153 ± 16% -21.3% 83525 ± 5% numa-meminfo.node2.Slab 247154 ± 4% -7.6% 228415 numa-meminfo.node3.Unevictable 11904 ± 4% +17.1% 13945 ± 8% numa-vmstat.node0 2239 ± 22% -26.6% 1644 ± 2% numa-vmstat.node2.nr_mapped 10728 ± 21% -28.2% 7701 ± 7% numa-vmstat.node2.nr_slab_reclaimable 15803 ± 13% -16.6% 13179 ± 6% numa-vmstat.node2.nr_slab_unreclaimable 61788 ± 4% -7.6% 57103 numa-vmstat.node3.nr_unevictable 61788 ± 4% -7.6% 57103 numa-vmstat.node3.nr_zone_unevictable 92778 ± 2% +17.6% 109080 time.involuntary_context_switches 21954366 ± 3% +4.1% 22857988 ± 2% time.maximum_resident_set_size 4.81e+08 ± 2% -18.9% 3.899e+08 time.minor_page_faults 5804 +12.2% 6512 time.percent_of_cpu_this_job_got 34918 +12.2% 39193 time.system_time 5638528 ± 2% -15.3% 4778392 time.voluntary_context_switches 3942289 ± 2% -10.5% 3528902 ± 2% cpuidle.C1.time 242290 -14.2% 207992 cpuidle.C1.usage 1.64e+09 ± 2% -15.7% 1.381e+09 cpuidle.C1E.time 4621281 ± 2% -14.7% 3939757 cpuidle.C1E.usage 2.115e+10 ± 2% -18.5% 1.723e+10 cpuidle.C6.time 24771099 ± 2% -18.0% 20305766 cpuidle.C6.usage 1210810 ± 4% -17.6% 997270 ± 2% cpuidle.POLL.time 18742 ± 3% -17.0% 15559 ± 2% cpuidle.POLL.usage 4135 ±141% -100.0% 0.00 latency_stats.avg.x86_reserve_hardware.x86_pmu_event_init.perf_try_init_event.perf_event_alloc.__do_sys_perf_event_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 33249 ±129% -100.0% 0.00 latency_stats.max.call_rwsem_down_read_failed.m_start.seq_read.__vfs_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe 4135 ±141% -100.0% 0.00 latency_stats.max.x86_reserve_hardware.x86_pmu_event_init.perf_try_init_event.perf_event_alloc.__do_sys_perf_event_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 65839 
±116% -100.0% 0.00 latency_stats.sum.call_rwsem_down_read_failed.m_start.seq_read.__vfs_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe 4135 ±141% -100.0% 0.00 latency_stats.sum.x86_reserve_hardware.x86_pmu_event_init.perf_try_init_event.perf_event_alloc.__do_sys_perf_event_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 8387 ±122% -90.9% 767.00 ± 13% latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat 263970 ± 10% -68.6% 82994 ± 3% latency_stats.sum.do_syslog.kmsg_read.proc_reg_read.__vfs_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe 6173 ± 77% +173.3% 16869 ± 98% latency_stats.sum.pipe_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe 101.33 -4.6% 96.67 proc-vmstat.nr_anon_transparent_hugepages 39967 -1.8% 39241 proc-vmstat.nr_slab_reclaimable 67166 -2.4% 65522 proc-vmstat.nr_slab_unreclaimable 237743 -3.9% 228396 proc-vmstat.nr_unevictable 237743 -3.9% 228396 proc-vmstat.nr_zone_unevictable 4.807e+09 -2.0% 4.71e+09 proc-vmstat.numa_hit 4.807e+09 -2.0% 4.71e+09 proc-vmstat.numa_local 4.791e+09 -2.1% 4.69e+09 proc-vmstat.pgalloc_normal 4.783e+09 -2.0% 4.685e+09 proc-vmstat.pgfault 4.807e+09 -2.0% 4.709e+09 proc-vmstat.pgfree 1753 +4.6% 1833 turbostat.Avg_MHz 239445 -14.1% 205783 turbostat.C1 4617105 ± 2% -14.8% 3934693 turbostat.C1E 1.40 ± 2% -0.2 1.18 turbostat.C1E% 24764661 ± 2% -18.0% 20297643 turbostat.C6 18.09 ± 2% -3.4 14.74 turbostat.C6% 7.53 ± 2% -17.1% 6.24 turbostat.CPU%c1 11.88 ± 2% -19.1% 9.61 turbostat.CPU%c6 7.62 ± 3% -20.8% 6.04 turbostat.Pkg%pc2 388.30 +1.5% 393.93 turbostat.PkgWatt 390974 ± 8% +35.8% 530867 ± 11% sched_debug.cfs_rq:/.min_vruntime.stddev -1754042 +75.7% -3081270 sched_debug.cfs_rq:/.spread0.min 388140 ± 8% +36.2% 528494 ± 11% sched_debug.cfs_rq:/.spread0.stddev 542.30 ± 3% -10.0% 488.21 ± 3% sched_debug.cfs_rq:/.util_avg.min 53.35 ± 16% +48.7% 79.35 ± 12% sched_debug.cfs_rq:/.util_est_enqueued.avg 30520 ± 6% -15.2% 25883 ± 12% sched_debug.cpu.nr_switches.avg 473770 ± 27% -37.4% 296623 ± 32% sched_debug.cpu.nr_switches.max 17077 ± 2% -15.1% 14493 sched_debug.cpu.nr_switches.min 30138 ± 6% -15.0% 25606 ± 12% sched_debug.cpu.sched_count.avg 472345 ± 27% -37.2% 296419 ± 32% sched_debug.cpu.sched_count.max 16858 ± 2% -15.2% 14299 sched_debug.cpu.sched_count.min 8358 ± 2% -15.5% 7063 sched_debug.cpu.sched_goidle.avg 12225 -13.6% 10565 sched_debug.cpu.sched_goidle.max 8032 ± 2% -16.0% 6749 sched_debug.cpu.sched_goidle.min 14839 ± 6% -15.3% 12568 ± 12% sched_debug.cpu.ttwu_count.avg 235115 ± 28% -38.3% 145175 ± 31% sched_debug.cpu.ttwu_count.max 7627 ± 3% -15.9% 6413 ± 2% sched_debug.cpu.ttwu_count.min 226299 ± 29% -39.5% 136827 ± 32% sched_debug.cpu.ttwu_local.max 0.85 -0.0 0.81 perf-stat.branch-miss-rate% 3.675e+10 -4.1% 3.523e+10 perf-stat.branch-misses 4.052e+11 -2.3% 3.958e+11 perf-stat.cache-misses 7.008e+11 -2.5% 6.832e+11 perf-stat.cache-references 15320995 ± 4% -14.3% 13136557 ± 6% perf-stat.context-switches 9.16 +4.8% 9.59 perf-stat.cpi 2.03e+14 +4.6% 2.124e+14 perf-stat.cpu-cycles 44508 -1.7% 43743 perf-stat.cpu-migrations 1.30 -0.1 1.24 perf-stat.dTLB-store-miss-rate% 4.064e+10 -3.5% 3.922e+10 perf-stat.dTLB-store-misses 3.086e+12 +1.1% 3.119e+12 perf-stat.dTLB-stores 3.611e+08 ± 6% -8.5% 3.304e+08 ± 5% perf-stat.iTLB-loads 0.11 -4.6% 0.10 perf-stat.ipc 4.783e+09 -2.0% 4.685e+09 perf-stat.minor-faults 1.53 ± 2% 
-0.3 1.22 ± 8% perf-stat.node-load-miss-rate% 1.389e+09 ± 3% -22.1% 1.083e+09 ± 9% perf-stat.node-load-misses 8.922e+10 -1.9% 8.75e+10 perf-stat.node-loads 5.06 +1.7 6.77 ± 3% perf-stat.node-store-miss-rate% 1.204e+09 +29.3% 1.556e+09 ± 3% perf-stat.node-store-misses 2.256e+10 -5.1% 2.142e+10 ± 2% perf-stat.node-stores 4.783e+09 -2.0% 4.685e+09 perf-stat.page-faults 1399242 +1.9% 1425404 perf-stat.path-length 1144 ± 8% -13.6% 988.00 ± 8% slabinfo.Acpi-ParseExt.active_objs 1144 ± 8% -13.6% 988.00 ± 8% slabinfo.Acpi-ParseExt.num_objs 1878 ± 17% +29.0% 2422 ± 16% slabinfo.dmaengine-unmap-16.active_objs 1878 ± 17% +29.0% 2422 ± 16% slabinfo.dmaengine-unmap-16.num_objs 1085 ± 5% -24.1% 823.33 ± 9% slabinfo.file_lock_cache.active_objs 1085 ± 5% -24.1% 823.33 ± 9% slabinfo.file_lock_cache.num_objs 61584 ± 4% -16.6% 51381 ± 5% slabinfo.filp.active_objs 967.00 ± 4% -16.5% 807.67 ± 5% slabinfo.filp.active_slabs 61908 ± 4% -16.5% 51713 ± 5% slabinfo.filp.num_objs 967.00 ± 4% -16.5% 807.67 ± 5% slabinfo.filp.num_slabs 1455 -15.4% 1232 ± 4% slabinfo.nsproxy.active_objs 1455 -15.4% 1232 ± 4% slabinfo.nsproxy.num_objs 84720 ± 6% -18.3% 69210 ± 4% slabinfo.pid.active_objs 1324 ± 6% -18.2% 1083 ± 4% slabinfo.pid.active_slabs 84820 ± 5% -18.2% 69386 ± 4% slabinfo.pid.num_objs 1324 ± 6% -18.2% 1083 ± 4% slabinfo.pid.num_slabs 2112 ± 18% -26.3% 1557 ± 5% slabinfo.scsi_sense_cache.active_objs 2112 ± 18% -26.3% 1557 ± 5% slabinfo.scsi_sense_cache.num_objs 5018 ± 5% -7.6% 4635 ± 4% slabinfo.sock_inode_cache.active_objs 5018 ± 5% -7.6% 4635 ± 4% slabinfo.sock_inode_cache.num_objs 1193 ± 4% +13.8% 1358 ± 4% slabinfo.task_group.active_objs 1193 ± 4% +13.8% 1358 ± 4% slabinfo.task_group.num_objs 62807 ± 3% -14.4% 53757 ± 3% slabinfo.vm_area_struct.active_objs 1571 ± 3% -12.1% 1381 ± 3% slabinfo.vm_area_struct.active_slabs 62877 ± 3% -14.3% 53880 ± 3% slabinfo.vm_area_struct.num_objs 1571 ± 3% -12.1% 1381 ± 3% slabinfo.vm_area_struct.num_slabs 47.45 -47.4 0.00 perf-profile.calltrace.cycles-pp.alloc_pages_vma.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault 47.16 -47.2 0.00 perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.alloc_pages_vma.__handle_mm_fault.handle_mm_fault.__do_page_fault 46.99 -47.0 0.00 perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.__handle_mm_fault.handle_mm_fault 44.95 -44.9 0.00 perf-profile.calltrace.cycles-pp._raw_spin_lock.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.__handle_mm_fault 7.42 ± 2% -7.4 0.00 perf-profile.calltrace.cycles-pp.copy_page.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault 6.32 ± 10% -6.3 0.00 perf-profile.calltrace.cycles-pp.finish_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault 6.28 ± 10% -6.3 0.00 perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault 0.00 +0.9 0.85 ± 11% perf-profile.calltrace.cycles-pp._raw_spin_lock.pte_map_lock.alloc_set_pte.finish_fault.handle_pte_fault 0.00 +0.9 0.92 ± 4% perf-profile.calltrace.cycles-pp.__list_del_entry_valid.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.handle_pte_fault 0.00 +1.1 1.13 ± 7% perf-profile.calltrace.cycles-pp.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault.handle_pte_fault 0.00 +1.2 1.19 ± 7% perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_fault.__do_fault.handle_pte_fault.__handle_mm_fault 0.00 +1.2 1.22 ± 5% 
perf-profile.calltrace.cycles-pp.pte_map_lock.alloc_set_pte.finish_fault.handle_pte_fault.__handle_mm_fault 0.00 +1.3 1.34 ± 7% perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.handle_pte_fault.__handle_mm_fault.handle_mm_fault 0.00 +1.4 1.36 ± 7% perf-profile.calltrace.cycles-pp.__do_fault.handle_pte_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault 0.00 +4.5 4.54 ± 19% perf-profile.calltrace.cycles-pp.pagevec_lru_move_fn.__lru_cache_add.alloc_set_pte.finish_fault.handle_pte_fault 0.00 +4.6 4.64 ± 19% perf-profile.calltrace.cycles-pp.__lru_cache_add.alloc_set_pte.finish_fault.handle_pte_fault.__handle_mm_fault 0.00 +6.6 6.64 ± 15% perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.handle_pte_fault.__handle_mm_fault.handle_mm_fault 0.00 +6.7 6.68 ± 15% perf-profile.calltrace.cycles-pp.finish_fault.handle_pte_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault 0.00 +7.5 7.54 ± 5% perf-profile.calltrace.cycles-pp.copy_page.handle_pte_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault 0.00 +44.6 44.55 ± 3% perf-profile.calltrace.cycles-pp._raw_spin_lock.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.handle_pte_fault 0.00 +46.6 46.63 ± 3% perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.handle_pte_fault.__handle_mm_fault 0.00 +46.8 46.81 ± 3% perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.alloc_pages_vma.handle_pte_fault.__handle_mm_fault.handle_mm_fault 0.00 +47.1 47.10 ± 3% perf-profile.calltrace.cycles-pp.alloc_pages_vma.handle_pte_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault 0.00 +63.1 63.15 perf-profile.calltrace.cycles-pp.handle_pte_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault 0.39 ± 3% +0.0 0.42 ± 3% perf-profile.children.cycles-pp.radix_tree_lookup_slot 0.21 ± 3% +0.0 0.25 ± 5% perf-profile.children.cycles-pp.__mod_node_page_state 0.00 +0.1 0.06 ± 8% perf-profile.children.cycles-pp.get_vma_policy 0.00 +0.1 0.08 ± 5% perf-profile.children.cycles-pp.__lru_cache_add_active_or_unevictable 0.00 +0.2 0.18 ± 6% perf-profile.children.cycles-pp.__page_add_new_anon_rmap 0.00 +1.4 1.35 ± 5% perf-profile.children.cycles-pp.pte_map_lock 0.00 +63.2 63.21 perf-profile.children.cycles-pp.handle_pte_fault 1.40 ± 2% -0.4 1.03 ± 10% perf-profile.self.cycles-pp._raw_spin_lock 0.56 ± 3% -0.2 0.35 ± 6% perf-profile.self.cycles-pp.__handle_mm_fault 0.22 ± 3% -0.0 0.18 ± 7% perf-profile.self.cycles-pp.alloc_set_pte 0.09 +0.0 0.10 ± 4% perf-profile.self.cycles-pp.vmacache_find 0.39 ± 2% +0.0 0.41 ± 3% perf-profile.self.cycles-pp.__radix_tree_lookup 0.18 +0.0 0.20 ± 6% perf-profile.self.cycles-pp.mem_cgroup_charge_statistics 0.17 ± 2% +0.0 0.20 ± 7% perf-profile.self.cycles-pp.___might_sleep 0.33 ± 2% +0.0 0.36 ± 6% perf-profile.self.cycles-pp.handle_mm_fault 0.20 ± 2% +0.0 0.24 ± 3% perf-profile.self.cycles-pp.__mod_node_page_state 0.00 +0.1 0.05 perf-profile.self.cycles-pp.finish_fault 0.00 +0.1 0.05 perf-profile.self.cycles-pp.get_vma_policy 0.00 +0.1 0.08 ± 10% perf-profile.self.cycles-pp.__lru_cache_add_active_or_unevictable 0.00 +0.2 0.25 ± 5% perf-profile.self.cycles-pp.handle_pte_fault 0.00 +0.5 0.49 ± 8% perf-profile.self.cycles-pp.pte_map_lock ========================================================================================= tbox_group/testcase/rootfs/kconfig/compiler/nr_task/thp_enabled/test/cpufreq_governor: lkp-skl-4sp1/will-it-scale/debian-x86_64-2018-04-03.cgz/x86_64-rhel-7.2/gcc-7/100%/never/page_fault2/performance commit: 
ba98a1cdad71d259a194461b3a61471b49b14df1 a7a8993bfe3ccb54ad468b9f1799649e4ad1ff12 ba98a1cdad71d259 a7a8993bfe3ccb54ad468b9f17 ---------------- -------------------------- fail:runs %reproduction fail:runs | | | 1:3 -33% :3 kmsg.DHCP/BOOTP:Reply_not_for_us_on_eth#,op[#]xid[#] :3 33% 1:3 dmesg.WARNING:stack_going_in_the_wrong_direction?ip=sched_slice/0x 1:3 -33% :3 dmesg.WARNING:stack_going_in_the_wrong_direction?ip=schedule_tail/0x 1:3 24% 2:3 perf-profile.calltrace.cycles-pp.sync_regs.error_entry 3:3 46% 5:3 perf-profile.calltrace.cycles-pp.error_entry 5:3 -9% 5:3 perf-profile.children.cycles-pp.error_entry 2:3 -4% 2:3 perf-profile.self.cycles-pp.error_entry %stddev %change %stddev \ | \ 8147 -18.8% 6613 will-it-scale.per_thread_ops 93113 +17.0% 108982 will-it-scale.time.involuntary_context_switches 4.732e+08 -19.0% 3.833e+08 will-it-scale.time.minor_page_faults 5854 +12.0% 6555 will-it-scale.time.percent_of_cpu_this_job_got 35247 +12.1% 39495 will-it-scale.time.system_time 5546661 -15.5% 4689314 will-it-scale.time.voluntary_context_switches 15801637 -1.9% 15504487 will-it-scale.workload 1.43 ± 11% -59.7% 0.58 ± 28% irq_exception_noise.__do_page_fault.min 2811 ± 3% +23.7% 3477 ± 3% kthread_noise.total_time 292776 ± 5% +39.6% 408829 ± 21% meminfo.DirectMap4k 19.80 -3.7 16.12 mpstat.cpu.idle% 29940 -14.5% 25593 uptime.idle 24064 ± 3% -8.5% 22016 vmstat.system.cs 34.86 -1.9% 34.19 boot-time.boot 26.95 -2.8% 26.19 ± 2% boot-time.kernel_boot 7190569 ± 2% -15.2% 6100136 ± 3% softirqs.RCU 5513663 -13.8% 4751548 softirqs.SCHED 18064 ± 2% +24.3% 22461 ± 7% numa-vmstat.node0.nr_slab_unreclaimable 8507 ± 12% -16.8% 7075 ± 4% numa-vmstat.node2.nr_slab_reclaimable 18719 ± 9% -19.6% 15043 ± 4% numa-vmstat.node3.nr_slab_unreclaimable 72265 ± 2% +24.3% 89855 ± 7% numa-meminfo.node0.SUnreclaim 115980 ± 4% +22.6% 142233 ± 12% numa-meminfo.node0.Slab 34035 ± 12% -16.8% 28307 ± 4% numa-meminfo.node2.SReclaimable 74888 ± 9% -19.7% 60162 ± 4% numa-meminfo.node3.SUnreclaim 93113 +17.0% 108982 time.involuntary_context_switches 4.732e+08 -19.0% 3.833e+08 time.minor_page_faults 5854 +12.0% 6555 time.percent_of_cpu_this_job_got 35247 +12.1% 39495 time.system_time 5546661 -15.5% 4689314 time.voluntary_context_switches 4.792e+09 -1.9% 4.699e+09 proc-vmstat.numa_hit 4.791e+09 -1.9% 4.699e+09 proc-vmstat.numa_local 40447 ± 11% +13.2% 45804 ± 6% proc-vmstat.pgactivate 4.778e+09 -1.9% 4.688e+09 proc-vmstat.pgalloc_normal 4.767e+09 -1.9% 4.675e+09 proc-vmstat.pgfault 4.791e+09 -1.9% 4.699e+09 proc-vmstat.pgfree 230178 ± 2% -10.1% 206883 ± 3% cpuidle.C1.usage 1.617e+09 -15.0% 1.375e+09 cpuidle.C1E.time 4514401 -14.1% 3878206 cpuidle.C1E.usage 2.087e+10 -18.5% 1.701e+10 cpuidle.C6.time 24458365 -18.0% 20045336 cpuidle.C6.usage 1163758 -16.1% 976094 ± 4% cpuidle.POLL.time 17907 -14.6% 15294 ± 4% cpuidle.POLL.usage 1758 +4.5% 1838 turbostat.Avg_MHz 227522 ± 2% -10.2% 204426 ± 3% turbostat.C1 4512700 -14.2% 3873264 turbostat.C1E 1.39 -0.2 1.18 turbostat.C1E% 24452583 -18.0% 20039031 turbostat.C6 17.85 -3.3 14.55 turbostat.C6% 7.44 -16.8% 6.19 turbostat.CPU%c1 11.72 -19.3% 9.45 turbostat.CPU%c6 7.51 -21.3% 5.91 turbostat.Pkg%pc2 389.33 +1.6% 395.59 turbostat.PkgWatt 559.33 ± 13% -17.9% 459.33 ± 20% slabinfo.dmaengine-unmap-128.active_objs 559.33 ± 13% -17.9% 459.33 ± 20% slabinfo.dmaengine-unmap-128.num_objs 57734 ± 3% -5.7% 54421 ± 4% slabinfo.filp.active_objs 905.67 ± 3% -5.6% 854.67 ± 4% slabinfo.filp.active_slabs 57981 ± 3% -5.6% 54720 ± 4% slabinfo.filp.num_objs 905.67 ± 3% -5.6% 854.67 ± 4% slabinfo.filp.num_slabs 1378 
-12.0% 1212 ± 7% slabinfo.nsproxy.active_objs 1378 -12.0% 1212 ± 7% slabinfo.nsproxy.num_objs 507.33 ± 7% -26.8% 371.33 ± 2% slabinfo.secpath_cache.active_objs 507.33 ± 7% -26.8% 371.33 ± 2% slabinfo.secpath_cache.num_objs 4788 ± 5% -8.3% 4391 ± 2% slabinfo.sock_inode_cache.active_objs 4788 ± 5% -8.3% 4391 ± 2% slabinfo.sock_inode_cache.num_objs 1431 ± 8% -12.3% 1255 ± 3% slabinfo.task_group.active_objs 1431 ± 8% -12.3% 1255 ± 3% slabinfo.task_group.num_objs 4.27 ± 17% +27.0% 5.42 ± 7% sched_debug.cfs_rq:/.runnable_load_avg.avg 13.44 ± 62% +73.6% 23.33 ± 24% sched_debug.cfs_rq:/.runnable_load_avg.stddev 772.55 ± 21% -32.7% 520.27 ± 4% sched_debug.cfs_rq:/.util_est_enqueued.max 4.39 ± 15% +29.0% 5.66 ± 11% sched_debug.cpu.cpu_load[0].avg 152.09 ± 72% +83.9% 279.67 ± 33% sched_debug.cpu.cpu_load[0].max 13.84 ± 58% +78.7% 24.72 ± 29% sched_debug.cpu.cpu_load[0].stddev 4.53 ± 14% +25.8% 5.70 ± 10% sched_debug.cpu.cpu_load[1].avg 156.58 ± 66% +76.6% 276.58 ± 33% sched_debug.cpu.cpu_load[1].max 14.02 ± 55% +72.4% 24.17 ± 28% sched_debug.cpu.cpu_load[1].stddev 4.87 ± 11% +17.3% 5.72 ± 9% sched_debug.cpu.cpu_load[2].avg 1.58 ± 2% +13.5% 1.79 ± 6% sched_debug.cpu.nr_running.max 16694 -14.6% 14259 sched_debug.cpu.nr_switches.min 31989 ± 13% +20.6% 38584 ± 6% sched_debug.cpu.nr_switches.stddev 16505 -14.8% 14068 sched_debug.cpu.sched_count.min 32084 ± 13% +19.9% 38482 ± 6% sched_debug.cpu.sched_count.stddev 8185 -15.0% 6957 sched_debug.cpu.sched_goidle.avg 12151 ± 2% -13.5% 10507 sched_debug.cpu.sched_goidle.max 7867 -15.7% 6631 sched_debug.cpu.sched_goidle.min 7595 -16.1% 6375 sched_debug.cpu.ttwu_count.min 15873 ± 13% +21.2% 19239 ± 6% sched_debug.cpu.ttwu_count.stddev 5244 ± 17% +17.0% 6134 ± 5% sched_debug.cpu.ttwu_local.avg 15646 ± 12% +21.5% 19008 ± 6% sched_debug.cpu.ttwu_local.stddev 0.85 -0.0 0.81 perf-stat.branch-miss-rate% 3.689e+10 -4.6% 3.518e+10 perf-stat.branch-misses 57.39 +0.6 58.00 perf-stat.cache-miss-rate% 4.014e+11 -1.2% 3.967e+11 perf-stat.cache-misses 6.994e+11 -2.2% 6.84e+11 perf-stat.cache-references 14605393 ± 3% -8.5% 13369913 perf-stat.context-switches 9.21 +4.5% 9.63 perf-stat.cpi 2.037e+14 +4.6% 2.13e+14 perf-stat.cpu-cycles 44424 -2.0% 43541 perf-stat.cpu-migrations 1.29 -0.1 1.24 perf-stat.dTLB-store-miss-rate% 4.018e+10 -2.8% 3.905e+10 perf-stat.dTLB-store-misses 3.071e+12 +1.4% 3.113e+12 perf-stat.dTLB-stores 93.04 +1.5 94.51 perf-stat.iTLB-load-miss-rate% 4.946e+09 +19.3% 5.903e+09 ± 5% perf-stat.iTLB-load-misses 3.702e+08 -7.5% 3.423e+08 ± 2% perf-stat.iTLB-loads 4470 -15.9% 3760 ± 5% perf-stat.instructions-per-iTLB-miss 0.11 -4.3% 0.10 perf-stat.ipc 4.767e+09 -1.9% 4.675e+09 perf-stat.minor-faults 1.46 ± 4% -0.1 1.33 ± 9% perf-stat.node-load-miss-rate% 4.91 +1.7 6.65 ± 2% perf-stat.node-store-miss-rate% 1.195e+09 +32.8% 1.587e+09 ± 2% perf-stat.node-store-misses 2.313e+10 -3.7% 2.227e+10 perf-stat.node-stores 4.767e+09 -1.9% 4.675e+09 perf-stat.page-faults 1399047 +2.0% 1427115 perf-stat.path-length 8908 ± 73% -100.0% 0.00 latency_stats.avg.call_rwsem_down_read_failed.m_start.seq_read.__vfs_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe 3604 ±141% -100.0% 0.00 latency_stats.avg.call_rwsem_down_write_failed.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe 61499 ±130% -92.6% 4534 ± 16% latency_stats.avg.expand_files.__alloc_fd.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 4391 ±138% -70.9% 1277 ±129% 
latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup_revalidate.lookup_fast.walk_component.link_path_walk.path_lookupat.filename_lookup 67311 ±112% -48.5% 34681 ± 36% latency_stats.avg.max 3956 ±138% +320.4% 16635 ±140% latency_stats.avg.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat 164.67 ± 30% +7264.0% 12126 ±138% latency_stats.avg.flush_work.fsnotify_destroy_group.inotify_release.__fput.task_work_run.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +5.4e+105% 5367 ±141% latency_stats.avg.call_rwsem_down_write_failed.unlink_file_vma.free_pgtables.exit_mmap.mmput.flush_old_exec.load_elf_binary.search_binary_handler.do_execveat_common.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe 36937 ±119% -100.0% 0.00 latency_stats.max.call_rwsem_down_read_failed.m_start.seq_read.__vfs_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe 3604 ±141% -100.0% 0.00 latency_stats.max.call_rwsem_down_write_failed.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe 84146 ±107% -72.5% 23171 ± 31% latency_stats.max.expand_files.__alloc_fd.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 4391 ±138% -70.9% 1277 ±129% latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup_revalidate.lookup_fast.walk_component.link_path_walk.path_lookupat.filename_lookup 5817 ± 83% -69.7% 1760 ± 67% latency_stats.max.pipe_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe 6720 ±137% +1628.2% 116147 ±141% latency_stats.max.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat 164.67 ± 30% +7264.0% 12126 ±138% latency_stats.max.flush_work.fsnotify_destroy_group.inotify_release.__fput.task_work_run.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +1.2e+106% 12153 ±141% latency_stats.max.call_rwsem_down_write_failed.unlink_file_vma.free_pgtables.exit_mmap.mmput.flush_old_exec.load_elf_binary.search_binary_handler.do_execveat_common.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe 110122 ±120% -100.0% 0.00 latency_stats.sum.call_rwsem_down_read_failed.m_start.seq_read.__vfs_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe 3604 ±141% -100.0% 0.00 latency_stats.sum.call_rwsem_down_write_failed.do_unlinkat.do_syscall_64.entry_SYSCALL_64_after_hwframe 12078828 ±139% -99.3% 89363 ± 29% latency_stats.sum.expand_files.__alloc_fd.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe 144453 ±120% -80.9% 27650 ± 19% latency_stats.sum.poll_schedule_timeout.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe 4391 ±138% -70.9% 1277 ±129% latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_lookup.nfs_lookup_revalidate.lookup_fast.walk_component.link_path_walk.path_lookupat.filename_lookup 9438 ± 86% -68.4% 2980 ± 35% latency_stats.sum.pipe_write.__vfs_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe 31656 ±138% +320.4% 133084 ±140% 
latency_stats.sum.rpc_wait_bit_killable.__rpc_execute.rpc_run_task.rpc_call_sync.nfs3_rpc_wrapper.nfs3_proc_getattr.__nfs_revalidate_inode.nfs_do_access.nfs_permission.inode_permission.link_path_walk.path_lookupat 164.67 ± 30% +7264.0% 12126 ±138% latency_stats.sum.flush_work.fsnotify_destroy_group.inotify_release.__fput.task_work_run.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +8.8e+105% 8760 ±141% latency_stats.sum.msleep_interruptible.uart_wait_until_sent.tty_wait_until_sent.tty_port_close_start.tty_port_close.tty_release.__fput.task_work_run.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +1.3e+106% 12897 ±141% latency_stats.sum.tty_wait_until_sent.tty_port_close_start.tty_port_close.tty_release.__fput.task_work_run.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe 0.00 +3.2e+106% 32207 ±141% latency_stats.sum.call_rwsem_down_write_failed.unlink_file_vma.free_pgtables.exit_mmap.mmput.flush_old_exec.load_elf_binary.search_binary_handler.do_execveat_common.__x64_sys_execve.do_syscall_64.entry_SYSCALL_64_after_hwframe 44.43 ± 3% -44.4 0.00 perf-profile.calltrace.cycles-pp.alloc_pages_vma.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault 44.13 ± 3% -44.1 0.00 perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.alloc_pages_vma.__handle_mm_fault.handle_mm_fault.__do_page_fault 43.95 ± 3% -43.9 0.00 perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.__handle_mm_fault.handle_mm_fault 41.85 ± 4% -41.9 0.00 perf-profile.calltrace.cycles-pp._raw_spin_lock.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.__handle_mm_fault 7.74 ± 8% -7.7 0.00 perf-profile.calltrace.cycles-pp.copy_page.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault 7.19 ± 4% -7.2 0.00 perf-profile.calltrace.cycles-pp.finish_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault 7.15 ± 4% -7.2 0.00 perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault 5.09 ± 3% -5.1 0.00 perf-profile.calltrace.cycles-pp.__lru_cache_add.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault 4.99 ± 3% -5.0 0.00 perf-profile.calltrace.cycles-pp.pagevec_lru_move_fn.__lru_cache_add.alloc_set_pte.finish_fault.__handle_mm_fault 0.93 ± 6% -0.1 0.81 ± 2% perf-profile.calltrace.cycles-pp.find_get_entry.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault 0.00 +0.8 0.84 perf-profile.calltrace.cycles-pp._raw_spin_lock.pte_map_lock.alloc_set_pte.finish_fault.handle_pte_fault 0.00 +0.9 0.92 ± 3% perf-profile.calltrace.cycles-pp.__list_del_entry_valid.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.handle_pte_fault 0.00 +1.1 1.08 perf-profile.calltrace.cycles-pp.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault.handle_pte_fault 0.00 +1.1 1.14 perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_fault.__do_fault.handle_pte_fault.__handle_mm_fault 0.00 +1.2 1.17 perf-profile.calltrace.cycles-pp.pte_map_lock.alloc_set_pte.finish_fault.handle_pte_fault.__handle_mm_fault 0.00 +1.3 1.29 perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.handle_pte_fault.__handle_mm_fault.handle_mm_fault 0.00 +1.3 1.31 perf-profile.calltrace.cycles-pp.__do_fault.handle_pte_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault 61.62 +1.7 63.33 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault 41.73 ± 4% +3.0 44.75 
perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma 0.00 +4.6 4.55 ± 15% perf-profile.calltrace.cycles-pp.pagevec_lru_move_fn.__lru_cache_add.alloc_set_pte.finish_fault.handle_pte_fault 0.00 +4.6 4.65 ± 14% perf-profile.calltrace.cycles-pp.__lru_cache_add.alloc_set_pte.finish_fault.handle_pte_fault.__handle_mm_fault 0.00 +6.6 6.57 ± 10% perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.handle_pte_fault.__handle_mm_fault.handle_mm_fault 0.00 +6.6 6.61 ± 10% perf-profile.calltrace.cycles-pp.finish_fault.handle_pte_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault 0.00 +7.2 7.25 ± 2% perf-profile.calltrace.cycles-pp.copy_page.handle_pte_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault 41.41 ± 70% +22.3 63.67 perf-profile.calltrace.cycles-pp.handle_mm_fault.__do_page_fault.do_page_fault.page_fault 42.19 ± 70% +22.6 64.75 perf-profile.calltrace.cycles-pp.__do_page_fault.do_page_fault.page_fault 42.20 ± 70% +22.6 64.76 perf-profile.calltrace.cycles-pp.do_page_fault.page_fault 42.27 ± 70% +22.6 64.86 perf-profile.calltrace.cycles-pp.page_fault 0.00 +44.9 44.88 perf-profile.calltrace.cycles-pp._raw_spin_lock.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.handle_pte_fault 0.00 +46.9 46.92 perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.handle_pte_fault.__handle_mm_fault 0.00 +47.1 47.10 perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.alloc_pages_vma.handle_pte_fault.__handle_mm_fault.handle_mm_fault 0.00 +47.4 47.37 perf-profile.calltrace.cycles-pp.alloc_pages_vma.handle_pte_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault 0.00 +63.0 63.00 perf-profile.calltrace.cycles-pp.handle_pte_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault 0.97 ± 6% -0.1 0.84 ± 2% perf-profile.children.cycles-pp.find_get_entry 1.23 ± 6% -0.1 1.11 perf-profile.children.cycles-pp.find_lock_entry 0.09 ± 10% -0.0 0.07 ± 6% perf-profile.children.cycles-pp.unlock_page 0.19 ± 4% +0.0 0.21 ± 2% perf-profile.children.cycles-pp.mem_cgroup_charge_statistics 0.21 ± 2% +0.0 0.25 perf-profile.children.cycles-pp.__mod_node_page_state 0.00 +0.1 0.05 ± 8% perf-profile.children.cycles-pp.get_vma_policy 0.00 +0.1 0.08 perf-profile.children.cycles-pp.__lru_cache_add_active_or_unevictable 0.00 +0.2 0.18 ± 2% perf-profile.children.cycles-pp.__page_add_new_anon_rmap 0.00 +1.3 1.30 perf-profile.children.cycles-pp.pte_map_lock 63.40 +1.6 64.97 perf-profile.children.cycles-pp.__do_page_fault 63.19 +1.6 64.83 perf-profile.children.cycles-pp.do_page_fault 61.69 +1.7 63.36 perf-profile.children.cycles-pp.__handle_mm_fault 63.19 +1.7 64.86 perf-profile.children.cycles-pp.page_fault 61.99 +1.7 63.70 perf-profile.children.cycles-pp.handle_mm_fault 72.27 +2.2 74.52 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath 67.51 +2.4 69.87 perf-profile.children.cycles-pp._raw_spin_lock 44.49 ± 3% +3.0 47.45 perf-profile.children.cycles-pp.alloc_pages_vma 44.28 ± 3% +3.0 47.26 perf-profile.children.cycles-pp.__alloc_pages_nodemask 44.13 ± 3% +3.0 47.12 perf-profile.children.cycles-pp.get_page_from_freelist 0.00 +63.1 63.06 perf-profile.children.cycles-pp.handle_pte_fault 1.46 ± 7% -0.5 1.01 perf-profile.self.cycles-pp._raw_spin_lock 0.58 ± 6% -0.2 0.34 perf-profile.self.cycles-pp.__handle_mm_fault 0.55 ± 6% -0.1 0.44 ± 2% perf-profile.self.cycles-pp.find_get_entry 0.22 ± 5% -0.1 0.16 ± 2% perf-profile.self.cycles-pp.alloc_set_pte 0.10 ± 8% 
     -0.0        0.08        perf-profile.self.cycles-pp.down_read_trylock
      0.09 ±  5%      -0.0        0.07        perf-profile.self.cycles-pp.unlock_page
      0.06            -0.0        0.05        perf-profile.self.cycles-pp.pmd_devmap_trans_unstable
      0.20 ±  2%      +0.0        0.24 ±  3%  perf-profile.self.cycles-pp.__mod_node_page_state
      0.00            +0.1        0.05        perf-profile.self.cycles-pp.finish_fault
      0.00            +0.1        0.05        perf-profile.self.cycles-pp.get_vma_policy
      0.00            +0.1        0.08 ±  6%  perf-profile.self.cycles-pp.__lru_cache_add_active_or_unevictable
      0.00            +0.2        0.25        perf-profile.self.cycles-pp.handle_pte_fault
      0.00            +0.5        0.46 ±  7%  perf-profile.self.cycles-pp.pte_map_lock
     72.26            +2.3       74.52        perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
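A note on reading the page_fault2 tables above: the rows that are 0.00 on the base kernel and non-zero on the head kernel (handle_pte_fault, pte_map_lock, __lru_cache_add_active_or_unevictable, ...) are functions that only show up in the profile once the speculative page fault patches are applied, so this is largely the same write-fault work relocated under the new handle_pte_fault() path rather than brand-new work; on both kernels the single biggest consumer stays the spinlock taken under get_page_from_freelist() (native_queued_spin_lock_slowpath, 72.26% -> 74.52% of self cycles). To make the calltraces easier to follow, here is a rough reconstruction of the kind of loop page_fault2 drives. It is not the upstream test source (that lives in the will-it-scale repository), and the temp-file name, the 128MB mapping size and the round count are invented for illustration; the shmem_fault/shmem_getpage_gfp entries suggest the backing file sits on tmpfs on the test box.

/*
 * Rough reconstruction of the page_fault2 loop, for illustration only --
 * the real test is in the will-it-scale repository; the temp-file name,
 * the 128MB mapping size and the fixed round count here are made up.
 * Each worker maps a private file-backed region, writes one byte per page
 * (one minor write fault each), then unmaps and starts again.
 */
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAPSIZE (128UL * 1024 * 1024)	/* assumed region size */

int main(void)
{
	long page = sysconf(_SC_PAGESIZE);
	char path[] = "/tmp/pf2-XXXXXX";	/* on the test box /tmp is
						   presumably tmpfs, hence the
						   shmem_fault entries */
	int fd = mkstemp(path);
	unsigned long long faults = 0;

	assert(fd >= 0);
	unlink(path);				/* file only lives as an fd */
	assert(ftruncate(fd, MAPSIZE) == 0);

	for (int round = 0; round < 16; round++) {
		/* MAP_PRIVATE: the first write to each page is a COW fault,
		 * i.e. the copy_page()/alloc_pages_vma() work in the profile */
		char *buf = mmap(NULL, MAPSIZE, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE, fd, 0);
		assert(buf != MAP_FAILED);

		for (unsigned long off = 0; off < MAPSIZE; off += page) {
			buf[off] = 1;		/* one write fault per page */
			faults++;
		}

		munmap(buf, MAPSIZE);
	}

	printf("%llu write faults\n", faults);
	close(fd);
	return 0;
}

With nr_task=100% there is roughly one copy of this loop per CPU, so most of the fault time is spent contending for the zone lock inside get_page_from_freelist(), which is consistent with native_queued_spin_lock_slowpath dominating on both kernels.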
<<attachment: perf-profile.zip>>
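The brk1 section near the top of the comparison reads the same way: put_vma, __vma_merge and _raw_write_lock are 0.00 on the base kernel and only appear on the SPF head kernel, sitting under remove_vma(), do_brk_flags(), __vma_rb_erase() and __vma_link_rb() in the calltrace, which is presumably where the ~2.2% brk1 per_process_ops delta comes from. For reference, a rough reconstruction of the brk1 loop follows (again not the upstream source; the iteration count is arbitrary): each iteration grows the heap by one page with brk() and immediately shrinks it back, so the kernel alternates between the do_brk_flags()/vma_link() and do_munmap()/remove_vma() paths that dominate that profile.

/*
 * Rough reconstruction of the brk1 loop, for illustration only -- the real
 * test is in the will-it-scale repository and the iteration count here is
 * arbitrary.
 */
#define _DEFAULT_SOURCE			/* for brk()/sbrk() declarations in glibc */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	long page = sysconf(_SC_PAGESIZE);
	char *start = sbrk(0);		/* current program break */
	unsigned long long iterations = 0;

	assert(start != (char *)-1);

	for (int i = 0; i < 1000000; i++) {
		assert(brk(start + page) == 0);	/* grow: do_brk_flags()/vma_link() */
		assert(brk(start) == 0);	/* shrink: do_munmap()/remove_vma() */
		iterations++;
	}

	printf("%llu brk() pairs\n", iterations);
	return 0;
}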