On Thu, Jan 03, 2019 at 09:10:13AM -0800, Yang Shi wrote:
> How about the below description:
>
> The test with page_fault1 of will-it-scale (sometimes tracing may just show
> runtest.py that is the wrapper script of page_fault1), which basically
> launches NR_CPU threads to generate 128MB anonymous pages for each thread,
> on my virtual machine with congested HDD shows long tail latency is reduced
> significantly.
>
> Without the patch
>  page_fault1_thr-1490  [023]   129.311706: funcgraph_entry: #57377.796 us | do_swap_page();
>  page_fault1_thr-1490  [023]   129.369103: funcgraph_entry:     5.642us  | do_swap_page();
>  page_fault1_thr-1490  [023]   129.369119: funcgraph_entry: #1289.592 us | do_swap_page();
>  page_fault1_thr-1490  [023]   129.370411: funcgraph_entry:     4.957us  | do_swap_page();
>  page_fault1_thr-1490  [023]   129.370419: funcgraph_entry:     1.940us  | do_swap_page();
>  page_fault1_thr-1490  [023]   129.378847: funcgraph_entry: #1411.385 us | do_swap_page();
>  page_fault1_thr-1490  [023]   129.380262: funcgraph_entry:     3.916us  | do_swap_page();
>  page_fault1_thr-1490  [023]   129.380275: funcgraph_entry: #4287.751 us | do_swap_page();
>
> With the patch
>  runtest.py-1417  [020]   301.925911: funcgraph_entry: #9870.146 us | do_swap_page();
>  runtest.py-1417  [020]   301.935785: funcgraph_entry:     9.802us  | do_swap_page();
>  runtest.py-1417  [020]   301.935799: funcgraph_entry:     3.551us  | do_swap_page();
>  runtest.py-1417  [020]   301.935806: funcgraph_entry:     2.142us  | do_swap_page();
>  runtest.py-1417  [020]   301.935853: funcgraph_entry:     6.938us  | do_swap_page();
>  runtest.py-1417  [020]   301.935864: funcgraph_entry:     3.765us  | do_swap_page();
>  runtest.py-1417  [020]   301.935871: funcgraph_entry:     3.600us  | do_swap_page();
>  runtest.py-1417  [020]   301.935878: funcgraph_entry:     7.202us  | do_swap_page();

That's better, thanks!
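
For anyone who wants to compare the two quoted traces numerically, here is a minimal sketch that pulls the do_swap_page() durations out of function_graph output and reports the count, the worst case, and the number of slow calls. The helper names and the 1000 us "slow" threshold are my own assumptions for illustration, not part of the patch or of any tool. The sample strings contain only the duration columns copied from the traces above.

```python
import re

# Matches the duration column of a function_graph entry line, e.g.
# "funcgraph_entry: #57377.796 us | do_swap_page();" or "... 5.642us | ...".
# The optional leading character is the tracer's overhead marker.
DURATION_RE = re.compile(r"funcgraph_entry:\s*[#!+*@$]?\s*([0-9.]+)\s*us\s*\|")


def durations_us(trace_text):
    """Return every duration (in microseconds) found in the trace text."""
    return [float(m.group(1)) for m in DURATION_RE.finditer(trace_text)]


def summarize(trace_text, slow_us=1000.0):
    """Return (count, max duration, number of calls at or above slow_us).

    slow_us is an arbitrary threshold chosen for this sketch.
    """
    d = durations_us(trace_text)
    return len(d), max(d), sum(1 for x in d if x >= slow_us)


# Duration columns copied from the quoted traces.
without_patch = """
funcgraph_entry: #57377.796 us | do_swap_page();
funcgraph_entry: 5.642us | do_swap_page();
funcgraph_entry: #1289.592 us | do_swap_page();
funcgraph_entry: 4.957us | do_swap_page();
funcgraph_entry: 1.940us | do_swap_page();
funcgraph_entry: #1411.385 us | do_swap_page();
funcgraph_entry: 3.916us | do_swap_page();
funcgraph_entry: #4287.751 us | do_swap_page();
"""

with_patch = """
funcgraph_entry: #9870.146 us | do_swap_page();
funcgraph_entry: 9.802us | do_swap_page();
funcgraph_entry: 3.551us | do_swap_page();
funcgraph_entry: 2.142us | do_swap_page();
funcgraph_entry: 6.938us | do_swap_page();
funcgraph_entry: 3.765us | do_swap_page();
funcgraph_entry: 3.600us | do_swap_page();
funcgraph_entry: 7.202us | do_swap_page();
"""

if __name__ == "__main__":
    print("without patch:", summarize(without_patch))
    print("with patch:   ", summarize(with_patch))
```

On these eight-sample excerpts the sketch counts four calls at or above 1000 us without the patch and one with it. The samples are far too small to stand in for a full run, so treat this only as a way to read the traces, not as a benchmark result.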