On 03/20/2018 04:54 AM, Aaron Lu wrote:
> This series is meant to improve zone->lock scalability for order 0 pages. With will-it-scale/page_fault1 workload, on a 2 sockets Intel Skylake server with 112 CPUs, CPU spend 80% of its time spinning on zone->lock. Perf profile shows the most time consuming part under zone->lock is the cache miss on "struct page", so here I'm trying to avoid those cache misses.
I ran page_fault1 to compare 4.16-rc5 against your recent work: these four patches plus the three others from your GitHub branch zone_lock_rfc_v2. Out of curiosity, I also threw in another 4.16-rc5 kernel with the pcp batch size set so high (10922 pages) that we always stay in the pcp lists and out of buddy completely. I used your patch[*] in this last kernel.
This was on a 2-socket, 20-core Broadwell server. There were some small regressions a bit outside the noise at low process counts (2-5), but I'm not sure they're repeatable. Anyway, it does improve the microbenchmark across the board.
[*] lkml.kernel.org/r/20170919072342.GB7263@intel.com
Attachment: gnuplot-pgfs-vs-ntask-iter1.png (PNG image)
kernel (#)      ntask     proc      thr        proc   stdev        thr   stdev
                       speedup  speedup       pgf/s              pgf/s
4.16-rc5 (1)        1                       586,305     747    587,731   1,766
lu-zone (2)         1     4.0%     3.4%     609,505   1,562    608,007   1,169
4.16-rc5-nz (3)     1     8.0%     5.9%     633,145   1,752    622,690   1,286
4.16-rc5 (1)        2                     1,131,428   7,396  1,022,890   7,333
lu-zone (2)         2    -1.0%    -0.3%   1,119,974   2,557  1,020,102   5,706
4.16-rc5-nz (3)     2     3.0%     3.3%   1,165,004   6,232  1,056,689   6,411
4.16-rc5 (1)        3                     1,590,413   6,411  1,346,900   8,924
lu-zone (2)         3    -0.7%     1.0%   1,579,816   7,216  1,360,376   4,418
4.16-rc5-nz (3)     3     2.3%     3.4%   1,626,925   5,321  1,392,515   8,180
4.16-rc5 (1)        4                     2,064,476   5,034  1,656,475  14,713
lu-zone (2)         4    -1.0%     0.9%   2,043,240   9,069  1,672,036   5,342
4.16-rc5-nz (3)     4     1.2%     3.9%   2,089,959   8,287  1,721,614   7,796
4.16-rc5 (1)        5                     2,486,090  11,178  1,878,085  15,286
lu-zone (2)         5    -0.1%     1.1%   2,483,021  15,100  1,898,459   9,295
4.16-rc5-nz (3)     5     1.3%     4.0%   2,517,602  13,717  1,952,995   7,481
4.16-rc5 (1)        6                     2,869,756   9,194  2,058,398  20,580
lu-zone (2)         6     0.4%     3.5%   2,882,220  20,583  2,129,444   9,689
4.16-rc5-nz (3)     6     2.4%     6.0%   2,937,618  14,126  2,182,859   9,650
4.16-rc5 (1)        7                     3,242,589  15,354  2,231,977  20,188
lu-zone (2)         7     0.9%     3.2%   3,270,796  15,124  2,303,780   6,607
4.16-rc5-nz (3)     7     2.0%     6.7%   3,306,528  17,683  2,381,279  16,507
4.16-rc5 (1)        8                     3,598,209  10,764  2,361,819  13,508
lu-zone (2)         8     1.2%     4.5%   3,642,407  14,893  2,469,191  16,250
4.16-rc5-nz (3)     8     2.0%     8.1%   3,671,834  17,501  2,552,786  12,112
4.16-rc5 (1)        9                     3,974,345  12,605  2,511,565  31,986
lu-zone (2)         9     2.3%     4.8%   4,067,553  11,069  2,632,608   8,158
4.16-rc5-nz (3)     9     2.5%     9.8%   4,073,111  12,463  2,758,075  31,432
4.16-rc5 (1)       10                     4,333,026  12,187  2,636,914  15,064
lu-zone (2)        10     2.4%     5.8%   4,435,852  21,691  2,789,949  16,399
4.16-rc5-nz (3)    10     3.2%    10.3%   4,470,666  23,663  2,907,263  15,052
4.16-rc5 (1)       11                     4,932,423  12,183  2,675,769  23,924
lu-zone (2)        11     2.7%     3.6%   5,064,666  18,600  2,771,476  22,438
4.16-rc5-nz (3)    11     3.6%     7.9%   5,110,434  21,459  2,888,419  18,180
4.16-rc5 (1)       12                     5,461,255  14,704  2,631,232  24,390
lu-zone (2)        12     1.7%     2.5%   5,554,957  20,978  2,697,554  19,369
4.16-rc5-nz (3)    12     3.1%     5.8%   5,629,143  22,781  2,782,902  20,346
4.16-rc5 (1)       13                     5,924,367  11,835  2,445,607  25,821
lu-zone (2)        13     1.4%     4.9%   6,004,723  19,070  2,566,547  26,031
4.16-rc5-nz (3)    13     2.9%     8.4%   6,094,087  17,676  2,651,793  20,050
4.16-rc5 (1)       14                     6,381,611  16,791  2,277,611  39,557
lu-zone (2)        14     1.2%     4.4%   6,459,693  18,869  2,377,795  25,093
4.16-rc5-nz (3)    14     2.4%    12.4%   6,534,837  24,990  2,560,085  12,638
4.16-rc5 (1)       15                     6,804,737  13,653  2,232,121  19,408
lu-zone (2)        15     1.1%     4.8%   6,881,730  18,995  2,338,868  18,922
4.16-rc5-nz (3)    15     2.4%    12.5%   6,970,318  28,677  2,510,594  25,890
4.16-rc5 (1)       16                     7,197,145  17,499  2,313,168  16,694
lu-zone (2)        16     1.2%     4.5%   7,287,072  28,727  2,418,232  23,120
4.16-rc5-nz (3)    16     2.6%     7.6%   7,383,613  17,991  2,489,382  28,385
4.16-rc5 (1)       17                     7,550,498  15,101  2,226,427  24,768
lu-zone (2)        17     1.6%     4.4%   7,667,641  24,305  2,324,675  16,265
4.16-rc5-nz (3)    17     2.8%     5.3%   7,761,917  29,854  2,345,195  20,659
4.16-rc5 (1)       18                     7,902,794  12,578  2,188,399  37,453
lu-zone (2)        18     1.7%     6.0%   8,033,876  21,158  2,320,641  13,053
4.16-rc5-nz (3)    18     2.8%     8.9%   8,126,732  27,619  2,383,083  25,172
4.16-rc5 (1)       19                     8,277,506  15,448  2,198,021  33,074
lu-zone (2)        19     2.1%     6.4%   8,453,255  17,411  2,339,395  20,220
4.16-rc5-nz (3)    19     3.0%     7.7%   8,529,130  29,852  2,366,800  20,139
4.16-rc5 (1)       20                     8,651,034  19,705  2,239,626  25,694
lu-zone (2)        20     2.2%     3.2%   8,840,988  24,387  2,311,966  25,693
4.16-rc5-nz (3)    20     3.1%     9.3%   8,918,721  34,761  2,448,589  26,227
4.16-rc5 (1)       21                     8,777,023  11,833  2,259,622  32,154
lu-zone (2)        21     2.5%     5.7%   8,993,348  25,480  2,389,000  29,464
4.16-rc5-nz (3)    21     3.5%     6.3%   9,085,319  27,790  2,401,713  28,706
4.16-rc5 (1)       22                     8,855,202  27,455  2,268,030  35,914
lu-zone (2)        22     3.0%     4.1%   9,123,705  34,288  2,361,876  31,917
4.16-rc5-nz (3)    22     4.2%    11.9%   9,228,843  33,483  2,536,867  26,000
4.16-rc5 (1)       23                     8,952,897  21,601  2,280,539  30,530
lu-zone (2)        23     3.5%    10.1%   9,268,365  30,803  2,510,883  26,312
4.16-rc5-nz (3)    23     4.6%    10.3%   9,367,048  30,681  2,514,740  32,896
4.16-rc5 (1)       24                     9,036,582  19,482  2,374,728  42,891
lu-zone (2)        24     3.7%     2.2%   9,369,541  33,253  2,425,993  45,952
4.16-rc5-nz (3)    24     5.0%     4.5%   9,489,640  28,066  2,482,179  33,820
4.16-rc5 (1)       25                     9,136,041  18,232  2,336,090  30,036
lu-zone (2)        25     4.0%     6.1%   9,497,501  34,408  2,478,602  31,959
4.16-rc5-nz (3)    25     4.7%    10.5%   9,563,832  34,506  2,581,056  38,516
4.16-rc5 (1)       26                     9,226,630  17,998  2,326,070  33,782
lu-zone (2)        26     3.7%     7.8%   9,570,547  27,052  2,508,396  41,848
4.16-rc5-nz (3)    26     4.4%    10.4%   9,634,192  40,217  2,566,842  31,115
4.16-rc5 (1)       27                     9,305,784  24,573  2,391,261  29,547
lu-zone (2)        27     3.8%     6.1%   9,656,252  39,164  2,536,624  45,738
4.16-rc5-nz (3)    27     4.5%     5.2%   9,720,210  31,171  2,516,447  32,347
4.16-rc5 (1)       28                     9,381,004  19,377  2,442,774  35,745
lu-zone (2)        28     2.6%     1.3%   9,626,125  65,187  2,474,560  29,977
4.16-rc5-nz (3)    28     4.1%     4.4%   9,766,045  54,298  2,549,227  31,199
4.16-rc5 (1)       29                     9,401,844  27,746  2,456,372  40,549
lu-zone (2)        29     2.7%     3.8%   9,652,161  51,629  2,549,004  39,990
4.16-rc5-nz (3)    29     3.5%     6.9%   9,731,681  51,852  2,625,589  27,821
4.16-rc5 (1)       30                     9,428,320  17,561  2,509,119  39,752
lu-zone (2)        30     2.1%     6.1%   9,630,472  50,106  2,662,447  44,347
4.16-rc5-nz (3)    30     3.1%     5.7%   9,722,152  50,348  2,651,891  28,518
4.16-rc5 (1)       31                     9,561,062  21,909  2,392,883  24,180
lu-zone (2)        31     2.0%     7.1%   9,755,774  60,381  2,563,573  32,132
4.16-rc5-nz (3)    31     3.5%    15.0%   9,894,735  45,966  2,752,506  28,516
4.16-rc5 (1)       32                     9,624,859  30,462  2,480,667  27,055
lu-zone (2)        32     2.7%     5.5%   9,883,943  61,850  2,618,326  36,655
4.16-rc5-nz (3)    32     4.5%    15.4%  10,057,320  46,352  2,863,788  34,021
4.16-rc5 (1)       33                     9,739,896  35,435  2,476,666  30,301
lu-zone (2)        33     3.1%     8.2%  10,043,706  60,943  2,680,570  42,346
4.16-rc5-nz (3)    33     4.6%    16.8%  10,191,082  51,348  2,893,385  32,717
4.16-rc5 (1)       34                     9,833,955  39,366  2,628,480  36,567
lu-zone (2)        34     3.5%     2.7%  10,180,871  50,050  2,699,941  42,804
4.16-rc5-nz (3)    34     5.0%     7.3%  10,323,136  50,767  2,820,628  30,551
4.16-rc5 (1)       35                     9,908,832  20,826  2,666,415  51,379
lu-zone (2)        35     3.5%     0.5%  10,251,385  58,551  2,679,925  49,143
4.16-rc5-nz (3)    35     5.1%     5.7%  10,418,155  51,726  2,817,192  34,042
4.16-rc5 (1)       36                     9,969,311  20,377  2,563,399  36,720
lu-zone (2)        36     3.5%     4.8%  10,314,449  60,867  2,686,176  42,925
4.16-rc5-nz (3)    36     5.4%     9.1%  10,504,881  53,100  2,796,816  37,461
4.16-rc5 (1)       37                    10,077,169  36,182  2,584,728  32,672
lu-zone (2)        37     3.1%     7.4%  10,393,453  63,048  2,775,523  39,745
4.16-rc5-nz (3)    37     4.7%    11.9%  10,549,870  45,280  2,893,000  39,102
4.16-rc5 (1)       38                    10,115,997  25,835  2,653,036  33,259
lu-zone (2)        38     2.7%     5.7%  10,388,901  63,402  2,803,290  36,020
4.16-rc5-nz (3)    38     4.6%    11.2%  10,580,796  63,516  2,949,834  31,421
4.16-rc5 (1)       39                    10,162,757  33,118  2,681,195  30,591
lu-zone (2)        39     2.5%     3.0%  10,413,010  76,719  2,761,752  32,471
4.16-rc5-nz (3)    39     4.0%     9.5%  10,568,061  65,614  2,935,127  38,463
4.16-rc5 (1)       40                    10,223,472  41,882  2,670,421  26,049
lu-zone (2)        40     2.4%     5.0%  10,470,977  58,008  2,803,478  37,111
4.16-rc5-nz (3)    40     4.1%     7.4%  10,646,450  54,810  2,868,986  52,724
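
For anyone who wants to check the speedup columns, they appear to be the relative change in mean pgf/s against the 4.16-rc5 baseline at the same ntask. A minimal sketch of that arithmetic, assuming this is how the percentages were derived (the helper name speedup_pct is hypothetical; the inputs are the process-mode means from the ntask 1 and ntask 2 rows of the table):

```python
def speedup_pct(baseline: float, candidate: float) -> float:
    """Relative throughput change versus the baseline kernel, in percent.

    Assumption: the table's speedup columns are (candidate - baseline) / baseline.
    """
    return (candidate - baseline) / baseline * 100.0


# Process-mode mean pgf/s taken from the table.
base_1task = 586_305       # 4.16-rc5 (1), ntask 1
lu_zone_1task = 609_505    # lu-zone (2), ntask 1
nz_1task = 633_145         # 4.16-rc5-nz (3), ntask 1

base_2task = 1_131_428     # 4.16-rc5 (1), ntask 2
lu_zone_2task = 1_119_974  # lu-zone (2), ntask 2

for label, base, cand in [
    ("lu-zone, ntask 1", base_1task, lu_zone_1task),
    ("4.16-rc5-nz, ntask 1", base_1task, nz_1task),
    ("lu-zone, ntask 2", base_2task, lu_zone_2task),
]:
    print(f"{label}: {speedup_pct(base, cand):.1f}%")
```

Rounded to one decimal place, these reproduce the 4.0%, 8.0% and -1.0% entries in the proc speedup column, which is consistent with the assumed formula.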