Re: [tip: sched/core] sched/fair: Multi-LLC select_idle_sibling()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jun 08, 2023 at 12:02:15AM +0530, K Prateek Nayak wrote:
> Hello Peter,
> 
> Below are the benchmark results on different NPS modes for SIS_NODE
> and SIS_NODE + additional suggested changes. None of them give a
> total win. Limit helps but there are cases where it still leads to
> regression. I'll leave full details below.
> 
> On 6/2/2023 12:24 PM, Peter Zijlstra wrote:
> > On Fri, Jun 02, 2023 at 10:43:37AM +0530, K Prateek Nayak wrote:
> >> Grouping near-CCX for the offerings that do not have 2CCX per CCD will
> >> prevent degenration and limit the search scope yes. Here is what I'll
> >> do, let me check if limiting search scope helps first, and then start
> >> fiddling with the topology. How does that sound?
> > 
> > So my preference would be the topology based solution, since the search
> > limit is random magic numbers that happen to work for 'your' machine but
> > who knows what it'll do for some other poor architecture that happens to
> > trip this.
> > 
> > That said; verifying the limit helps at all is of course a good start,
> > because if it doesn't then the topology thing will likely also not help
> > much.
> 
> o NPS Modes
> 
> NPS Modes are used to logically divide single socket into
> multiple NUMA region.
> Following is the NUMA configuration for each NPS mode on the system:
> 
> NPS1: Each socket is a NUMA node.
>     Total 2 NUMA nodes in the dual socket machine.
> 
>     Node 0: 0-63,   128-191
>     Node 1: 64-127, 192-255
> 
>     - 8CCX per node

Ok, so this is a dual-socket Zen3 with 64 cores per socket, right?


> o Kernel Versions
> 
> - tip              - tip:sched/core at commit e2a1f85bf9f5 "sched/psi:
>                      Avoid resetting the min update period when it is
>                      unnecessary")
> 
> - SIS_NODE         - tip:sched/core + this patch
> 
> - SIS_NODE_LIMIT   - tip:sched/core + this patch + nr=4 limit for SIS_NODE
> 		     (https://lore.kernel.org/all/20230601111326.GV4253@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/)
> 
> - SIS_NODE_TOPOEXT - tip:sched/core + this patch
>                      + new sched domain (Multi-Multi-Core or MMC)
> 		     (https://lore.kernel.org/all/20230601153522.GB559993@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/)
> 		     MMC domain groups 2 nearby CCX.

OK, so you managed to get the NPS4 topology in NPS1 mode?

> o Benchmark Results
> 
> Note: All benchmarks were run with boost enabled and C2 disabled.
> 
> ~~~~~~~~~~~~~
> ~ hackbench ~
> ~~~~~~~~~~~~~
> 
> o NPS1
> 
> Test:                   tip                     SIS_NODE           SIS_NODE_LIMIT          SIS_NODE_TOPOEXT
>  1-groups:         3.92 (0.00 pct)         4.05 (-3.31 pct)        3.78 (3.57 pct)         3.77 (3.82 pct)
>  2-groups:         4.58 (0.00 pct)         3.84 (16.15 pct)        4.50 (1.74 pct)         4.34 (5.24 pct)
>  4-groups:         4.99 (0.00 pct)         3.98 (20.24 pct)        4.93 (1.20 pct)         5.01 (-0.40 pct)
>  8-groups:         5.67 (0.00 pct)         6.05 (-6.70 pct)        5.73 (-1.05 pct)        5.95 (-4.93 pct)
> 16-groups:         7.88 (0.00 pct)        10.56 (-34.01 pct)       7.83 (0.63 pct)         8.04 (-2.03 pct)
> 
> o NPS2
> 
> Test:                   tip                     SIS_NODE           SIS_NODE_LIMIT          SIS_NODE_TOPOEXT
>  1-groups:         3.82 (0.00 pct)         3.68 (3.66 pct)         3.87 (-1.30 pct)        3.74 (2.09 pct)
>  2-groups:         4.40 (0.00 pct)         3.61 (17.95 pct)        4.45 (-1.13 pct)        4.30 (2.27 pct)
>  4-groups:         4.84 (0.00 pct)         3.62 (25.20 pct)        4.84 (0.00 pct)         4.97 (-2.68 pct)
>  8-groups:         5.45 (0.00 pct)         6.14 (-12.66 pct)       5.40 (0.91 pct)         5.68 (-4.22 pct)
> 16-groups:         6.94 (0.00 pct)         8.77 (-26.36 pct)       6.57 (5.33 pct)         7.87 (-13.40 pct)
> 
> o NPS4
> 
> Test:                   tip                     SIS_NODE           SIS_NODE_LIMIT          SIS_NODE_TOPOEXT
>  1-groups:         3.82 (0.00 pct)         3.84 (-0.52 pct)        3.83 (-0.26 pct)        3.85 (-0.78 pct)
>  2-groups:         4.44 (0.00 pct)         4.15 (6.53 pct)         4.43 (0.22 pct)         4.18 (5.85 pct)
>  4-groups:         4.86 (0.00 pct)         4.95 (-1.85 pct)        4.88 (-0.41 pct)        4.79 (1.44 pct)
>  8-groups:         5.42 (0.00 pct)         5.80 (-7.01 pct)        5.41 (0.18 pct)         5.75 (-6.08 pct)
> 16-groups:         6.68 (0.00 pct)         9.07 (-35.77 pct)       6.72 (-0.59 pct)        8.66 (-29.64 pct)

Win for NODE_LIMIT for having the least regressions, but also no real
gains.

Given NODE_TOPO does NPS4 that should be roughtly similar to limit=2 it
should do 'better' but it doesn't, it's markedly worse... weird.

In fact, none of the NPS4 numbers make any sense, if you've already
split the whole thing into 4, you remain with 2 CCXs per node and
NODE should be NODE_LIMIT should be NODE_TOPO.

All the NODE variants should end up scanning both CCXs and performance
should really be the same.

Something's wrong there.


> ~~~~~~~~~~
> ~ tbench ~
> ~~~~~~~~~~
> 
> o NPS1
> 
> Clients:      tip                     SIS_NODE             SIS_NODE_LIMIT         SIS_NODE_TOPOEXT
>     1    452.49 (0.00 pct)       457.94 (1.20 pct)       458.13 (1.24 pct)       447.69 (-1.06 pct)
>     2    862.44 (0.00 pct)       879.99 (2.03 pct)       881.19 (2.17 pct)       855.91 (-0.75 pct)
>     4    1604.27 (0.00 pct)      1618.87 (0.91 pct)      1628.00 (1.47 pct)      1627.14 (1.42 pct)
>     8    2966.77 (0.00 pct)      3040.90 (2.49 pct)      3037.70 (2.39 pct)      2957.91 (-0.29 pct)
>    16    5176.70 (0.00 pct)      5292.29 (2.23 pct)      5445.15 (5.18 pct)      5241.61 (1.25 pct)
>    32    8205.24 (0.00 pct)      8949.12 (9.06 pct)      8716.02 (6.22 pct)      8494.17 (3.52 pct)
>    64    13956.71 (0.00 pct)     14461.42 (3.61 pct)     13620.04 (-2.41 pct)    15045.43 (7.80 pct)
>   128    24005.50 (0.00 pct)     26052.75 (8.52 pct)     24975.03 (4.03 pct)     24008.73 (0.01 pct)
>   256    32457.61 (0.00 pct)     21999.41 (-32.22 pct)   30810.93 (-5.07 pct)    31060.12 (-4.30 pct)
>   512    34345.24 (0.00 pct)     41166.39 (19.86 pct)    30982.94 (-9.78 pct)    31864.14 (-7.22 pct)
>  1024    33432.92 (0.00 pct)     40900.84 (22.33 pct)    30953.61 (-7.41 pct)    32006.81 (-4.26 pct)
> 
> o NPS2
> 
> Clients:      tip                     SIS_NODE             SIS_NODE_LIMIT         SIS_NODE_TOPOEXT
>     1    453.73 (0.00 pct)       451.63 (-0.46 pct)      455.97 (0.49 pct)       453.79 (0.01 pct)
>     2    861.71 (0.00 pct)       857.85 (-0.44 pct)      868.30 (0.76 pct)       850.14 (-1.34 pct)
>     4    1599.14 (0.00 pct)      1609.30 (0.63 pct)      1656.08 (3.56 pct)      1619.10 (1.24 pct)
>     8    2951.03 (0.00 pct)      2944.71 (-0.21 pct)     3034.38 (2.82 pct)      2973.52 (0.76 pct)
>    16    5080.32 (0.00 pct)      5160.39 (1.57 pct)      5173.32 (1.83 pct)      5150.99 (1.39 pct)
>    32    7900.41 (0.00 pct)      8039.13 (1.75 pct)      8105.69 (2.59 pct)      7956.45 (0.70 pct)
>    64    14629.65 (0.00 pct)     15391.08 (5.20 pct)     14546.09 (-0.57 pct)    15410.41 (5.33 pct)
>   128    23155.88 (0.00 pct)     24015.45 (3.71 pct)     24263.82 (4.78 pct)     23351.35 (0.84 pct)
>   256    33449.57 (0.00 pct)     33571.08 (0.36 pct)     32048.20 (-4.18 pct)    32869.85 (-1.73 pct)
>   512    33757.47 (0.00 pct)     39872.69 (18.11 pct)    32945.66 (-2.40 pct)    34526.17 (2.27 pct)
>  1024    34823.14 (0.00 pct)     41090.15 (17.99 pct)    32404.40 (-6.94 pct)    34522.97 (-0.86 pct)
> 
> o NPS4
> 
> Clients:      tip                     SIS_NODE             SIS_NODE_LIMIT         SIS_NODE_TOPOEXT
>     1    450.14 (0.00 pct)       454.46 (0.95 pct)       454.53 (0.97 pct)       451.43 (0.28 pct)
>     2    863.26 (0.00 pct)       868.94 (0.65 pct)       891.89 (3.31 pct)       866.74 (0.40 pct)
>     4    1618.71 (0.00 pct)      1599.13 (-1.20 pct)     1630.29 (0.71 pct)      1610.08 (-0.53 pct)
>     8    2929.35 (0.00 pct)      3065.12 (4.63 pct)      3064.15 (4.60 pct)      3004.74 (2.57 pct)
>    16    5114.04 (0.00 pct)      5261.40 (2.88 pct)      5238.04 (2.42 pct)      5108.53 (-0.10 pct)
>    32    7912.18 (0.00 pct)      8926.77 (12.82 pct)     8382.51 (5.94 pct)      8214.73 (3.82 pct)
>    64    14424.72 (0.00 pct)     14853.61 (2.97 pct)     14273.54 (-1.04 pct)    14430.17 (0.03 pct)
>   128    23614.97 (0.00 pct)     24506.73 (3.77 pct)     24517.76 (3.82 pct)     23296.38 (-1.34 pct)
>   256    34365.13 (0.00 pct)     35538.42 (3.41 pct)     31909.66 (-7.14 pct)    31009.12 (-9.76 pct)
>   512    34215.50 (0.00 pct)     36017.49 (5.26 pct)     32696.70 (-4.43 pct)    33262.55 (-2.78 pct)
>  1024    35421.90 (0.00 pct)     35193.81 (-0.64 pct)    32611.10 (-7.93 pct)    32795.86 (-7.41 pct)

tbench likes NODE

> ~~~~~~~~~~
> ~ stream ~
> ~~~~~~~~~~
> 
> - 10 Runs
> 
> o NPS1
> 
> Test:         tip                     SIS_NODE             SIS_NODE_LIMIT          SIS_NODE_TOPOEXT
>  Copy:   271317.35 (0.00 pct)    292440.22 (7.78 pct)    302540.26 (11.50 pct)   287277.25 (5.88 pct)
> Scale:   205533.77 (0.00 pct)    203362.60 (-1.05 pct)   207750.30 (1.07 pct)    205206.26 (-0.15 pct)
>   Add:   221624.62 (0.00 pct)    225850.83 (1.90 pct)    233782.14 (5.48 pct)    229774.48 (3.67 pct)
> Triad:   228500.68 (0.00 pct)    225885.25 (-1.14 pct)   238331.69 (4.30 pct)    240041.53 (5.05 pct)
> 
> o NPS2
> 
> Test:         tip                     SIS_NODE             SIS_NODE_LIMIT          SIS_NODE_TOPOEXT
>  Copy:   277761.29 (0.00 pct)    301816.34 (8.66 pct)    293563.58 (5.68 pct)    308218.80 (10.96 pct)
> Scale:   215193.83 (0.00 pct)    212522.72 (-1.24 pct)   215758.66 (0.26 pct)    205678.94 (-4.42 pct)
>   Add:   242725.75 (0.00 pct)    242695.13 (-0.01 pct)   246472.20 (1.54 pct)    238089.46 (-1.91 pct)
> Triad:   237253.44 (0.00 pct)    250618.57 (5.63 pct)    239405.55 (0.90 pct)    249652.73 (5.22 pct)
> 
> o NPS4
> 
> Test:         tip                     SIS_NODE             SIS_NODE_LIMIT          SIS_NODE_TOPOEXT
>  Copy:   273307.14 (0.00 pct)    255091.78 (-6.66 pct)   301926.68 (10.47 pct)   262007.26 (-4.13 pct)
> Scale:   235715.23 (0.00 pct)    222018.36 (-5.81 pct)   224881.52 (-4.59 pct)   222282.64 (-5.69 pct)
>   Add:   244500.40 (0.00 pct)    230468.21 (-5.73 pct)   242625.18 (-0.76 pct)   227146.80 (-7.09 pct)
> Triad:   250600.04 (0.00 pct)    236229.50 (-5.73 pct)   258064.49 (2.97 pct)    231772.02 (-7.51 pct)
> 
> - 100 Runs
> 
> Test:         tip                     SIS_NODE             SIS_NODE_LIMIT          SIS_NODE_TOPOEXT
>  Copy:   317381.65 (0.00 pct)    318827.08 (0.45 pct)    320898.32 (1.10 pct)    318922.96 (0.48 pct)
> Scale:   214145.00 (0.00 pct)    206213.69 (-3.70 pct)   211019.12 (-1.45 pct)   210384.47 (-1.75 pct)
>   Add:   239243.29 (0.00 pct)    229791.67 (-3.95 pct)   233827.11 (-2.26 pct)   236659.48 (-1.07 pct)
> Triad:   249477.76 (0.00 pct)    236843.06 (-5.06 pct)   244688.91 (-1.91 pct)   235990.67 (-5.40 pct)
> 
> o NPS2
> 
> Test:         tip                     SIS_NODE             SIS_NODE_LIMIT          SIS_NODE_TOPOEXT
>  Copy:   318082.10 (0.00 pct)    322844.91 (1.49 pct)    310350.21 (-2.43 pct)   322495.84 (1.38 pct)
> Scale:   219338.56 (0.00 pct)    218139.90 (-0.54 pct)   212288.47 (-3.21 pct)   221040.27 (0.77 pct)
>   Add:   248118.20 (0.00 pct)    249826.98 (0.68 pct)    239682.55 (-3.39 pct)   253006.79 (1.97 pct)
> Triad:   247088.55 (0.00 pct)    260488.38 (5.42 pct)    247892.42 (0.32 pct)    249081.33 (0.80 pct)
> 
> o NPS4
> 
> Test:         tip                     SIS_NODE             SIS_NODE_LIMIT          SIS_NODE_TOPOEXT
>  Copy:   345396.19 (0.00 pct)    343675.74 (-0.49 pct)   346990.96 (0.46 pct)    334677.55 (-3.10 pct)
> Scale:   241521.63 (0.00 pct)    231494.70 (-4.15 pct)   236233.18 (-2.18 pct)   229159.01 (-5.11 pct)
>   Add:   261157.86 (0.00 pct)    249663.86 (-4.40 pct)   253402.85 (-2.96 pct)   242257.98 (-7.23 pct)
> Triad:   267804.99 (0.00 pct)    263071.00 (-1.76 pct)   264208.15 (-1.34 pct)   256978.50 (-4.04 pct)

Again, the NPS4 reults are weird.

> ~~~~~~~~~~~
> ~ netperf ~
> ~~~~~~~~~~~
> 
> o NPS1
> 
>                         tip                  SIS_NODE              SIS_NODE_LIMIT         SIS_NODE_TOPOEXT
> 1-clients:       102839.97 (0.00 pct)    103540.33 (0.68 pct)    103769.74 (0.90 pct)    103271.77 (0.41 pct)
> 2-clients:       98428.08 (0.00 pct)     100431.67 (2.03 pct)    100555.62 (2.16 pct)    100417.11 (2.02 pct)
> 4-clients:       92298.45 (0.00 pct)     94800.51 (2.71 pct)     93706.09 (1.52 pct)     94981.10 (2.90 pct)
> 8-clients:       85618.41 (0.00 pct)     89130.14 (4.10 pct)     87677.84 (2.40 pct)     88284.61 (3.11 pct)
> 16-clients:      78722.18 (0.00 pct)     79715.38 (1.26 pct)     80488.76 (2.24 pct)     78980.88 (0.32 pct)
> 32-clients:      73610.75 (0.00 pct)     72801.41 (-1.09 pct)    72167.43 (-1.96 pct)    75077.55 (1.99 pct)
> 64-clients:      55285.07 (0.00 pct)     56184.38 (1.62 pct)     56443.79 (2.09 pct)     60689.05 (9.77 pct)
> 128-clients:     31176.92 (0.00 pct)     32830.06 (5.30 pct)     35511.93 (13.90 pct)    35638.50 (14.31 pct)
> 256-clients:     20011.44 (0.00 pct)     15135.39 (-24.36 pct)   17599.21 (-12.05 pct)   18219.29 (-8.95 pct)
> 
> o NPS2
> 
>                         tip                  SIS_NODE              SIS_NODE_LIMIT         SIS_NODE_TOPOEXT
> 1-clients:       103105.55 (0.00 pct)    101582.75 (-1.47 pct)   103077.22 (-0.02 pct)   102233.63 (-0.84 pct)
> 2-clients:       98720.29 (0.00 pct)     98537.46 (-0.18 pct)    100761.54 (2.06 pct)    99211.39 (0.49 pct)
> 4-clients:       92289.39 (0.00 pct)     94332.45 (2.21 pct)     93622.46 (1.44 pct)     93321.77 (1.11 pct)
> 8-clients:       84998.63 (0.00 pct)     87180.90 (2.56 pct)     86970.84 (2.32 pct)     86076.75 (1.26 pct)
> 16-clients:      76395.81 (0.00 pct)     80017.06 (4.74 pct)     77937.29 (2.01 pct)     75090.85 (-1.70 pct)
> 32-clients:      71110.89 (0.00 pct)     69445.86 (-2.34 pct)    69273.81 (-2.58 pct)    66885.99 (-5.94 pct)
> 64-clients:      49526.21 (0.00 pct)     50004.13 (0.96 pct)     51649.09 (4.28 pct)     51100.52 (3.17 pct)
> 128-clients:     27917.51 (0.00 pct)     30581.70 (9.54 pct)     31587.40 (13.14 pct)    33477.65 (19.91 pct)
> 256-clients:     20067.17 (0.00 pct)     26002.42 (29.57 pct)    18681.28 (-6.90 pct)    18144.96 (-9.57 pct)
> 
> o NPS4
> 
>                         tip                  SIS_NODE              SIS_NODE_LIMIT         SIS_NODE_TOPOEXT
> 1-clients:       102139.49 (0.00 pct)    103578.02 (1.40 pct)    103633.90 (1.46 pct)    101656.07 (-0.47 pct)
> 2-clients:       98259.53 (0.00 pct)     99336.70 (1.09 pct)     99720.37 (1.48 pct)     98812.86 (0.56 pct)
> 4-clients:       91576.79 (0.00 pct)     95278.30 (4.04 pct)     93688.37 (2.30 pct)     93848.94 (2.48 pct)
> 8-clients:       84742.30 (0.00 pct)     89005.65 (5.03 pct)     87703.04 (3.49 pct)     86709.29 (2.32 pct)
> 16-clients:      79540.75 (0.00 pct)     85478.97 (7.46 pct)     83195.92 (4.59 pct)     81016.24 (1.85 pct)
> 32-clients:      71166.14 (0.00 pct)     74254.01 (4.33 pct)     72422.76 (1.76 pct)     71391.62 (0.31 pct)
> 64-clients:      51763.24 (0.00 pct)     52565.56 (1.54 pct)     55159.65 (6.56 pct)     52472.91 (1.37 pct)
> 128-clients:     27829.29 (0.00 pct)     35774.61 (28.55 pct)    33738.97 (21.23 pct)    34564.10 (24.20 pct)
> 256-clients:     24185.37 (0.00 pct)     27215.35 (12.52 pct)    17675.87 (-26.91 pct)   24937.66 (3.11 pct)

NPS4 is weird again, but mostly wins.

Based on the NPS1 results I'd say this one goes to TOPO

> ~~~~~~~~~~~~~~~~
> ~ ycsb-mongodb ~
> ~~~~~~~~~~~~~~~~
> 
> o NPS1
> 
> tip:			131070.33 (var: 2.84%)
> SIS_NODE:		131070.33 (var: 2.84%) (0.00%)
> SIS_NODE_LIMIT:		137227.00 (var: 4.97%) (4.69%)
> SIS_NODE_TOPOEXT:	133529.67 (var: 0.98%) (1.87%)
> 
> o NPS2
> 
> tip:			133693.67 (var: 1.69%)
> SIS_NODE:		134173.00 (var: 4.07%) (0.35%)
> SIS_NODE_LIMIT:		134124.67 (var: 2.20%) (0.32%)
> SIS_NODE_TOPOEXT:	133747.33 (var: 2.49%) (0.04%)
> 
> o NPS4
> 
> tip:			132913.67 (var: 1.97%)
> SIS_NODE:		133697.33 (var: 1.69%) (0.58%)
> SIS_NODE_LIMIT:		133307.33 (var: 1.03%) (0.29%)
> SIS_NODE_TOPOEXT:	133426.67 (var: 3.60%) (0.38%)
> 
> ~~~~~~~~~~~~~
> ~ unixbench ~
> ~~~~~~~~~~~~~
> 
> o NPS1
> 
> kernel                        			tip                  SIS_NODE               SIS_NODE_LIMIT            SIS_NODE_TOPOEXT
> Hmean     unixbench-dhry2reg-1   	  41322625.19 (   0.00%)    41224388.33 (  -0.24%)    41142898.66 (  -0.43%)    41222168.97 (  -0.24%)
> Hmean     unixbench-dhry2reg-512	6252491108.60 (   0.00%)  6240160851.68 (  -0.20%)  6262714194.10 (   0.16%)  6259553403.67 (   0.11%)
> Amean     unixbench-syscall-1    	   2501398.27 (   0.00%)    2577323.43 *  -3.04%*      2498697.20 (   0.11%)     2541279.77 *  -1.59%*
> Amean     unixbench-syscall-512  	   8120524.00 (   0.00%)    7512955.87 *   7.48%*      7447849.67 *   8.28%*     7477129.17 *   7.92%*
> Hmean     unixbench-pipe-1    		   2359346.02 (   0.00%)    2392308.62 *   1.40%*      2407625.04 *   2.05%*     2334146.94 *  -1.07%*
> Hmean     unixbench-pipe-512		 338790322.61 (   0.00%)  337711432.92 (  -0.32%)    340399941.24 (   0.48%)   339008490.26 (   0.06%)
> Hmean     unixbench-spawn-1       	      4261.52 (   0.00%)       4164.90 (  -2.27%)         4929.26 *  15.67%*        5111.16 *  19.94%*
> Hmean     unixbench-spawn-512    	     64328.93 (   0.00%)      62257.64 *  -3.22%*        63740.04 *  -0.92%*       63291.18 *  -1.61%*
> Hmean     unixbench-execl-1       	      3677.73 (   0.00%)       3652.08 (  -0.70%)         3642.56 *  -0.96%*        3671.98 (  -0.16%)
> Hmean     unixbench-execl-512    	     11984.83 (   0.00%)      13585.65 *  13.36%*        12496.80 (   4.27%)       12306.01 (   2.68%)
> 
> o NPS2
> 
> kernel                        			tip                  SIS_NODE               SIS_NODE_LIMIT            SIS_NODE_TOPOEXT
> Hmean     unixbench-dhry2reg-1   	  41311787.29 (   0.00%)    41412946.27 (   0.24%)    41035150.98 (  -0.67%)    41371003.93 (   0.14%)
> Hmean     unixbench-dhry2reg-512	6243873272.76 (   0.00%)  6256893083.32 (   0.21%)  6236751880.89 (  -0.11%)  6235047089.83 (  -0.14%)
> Amean     unixbench-syscall-1    	   2503190.70 (   0.00%)     2576854.30 *  -2.94%*     2496464.80 *   0.27%*     2540298.77 *  -1.48%*
> Amean     unixbench-syscall-512  	   8012388.13 (   0.00%)     7503196.87 *   6.36%*     7493284.60 *   6.48%*     7495117.73 *   6.46%*
> Hmean     unixbench-pipe-1    		   2340486.25 (   0.00%)     2388946.63 (   2.07%)     2412344.33 *   3.07%*     2360277.30 (   0.85%)
> Hmean     unixbench-pipe-512		 338965319.79 (   0.00%)   337225630.07 (  -0.51%)   339053027.04 (   0.03%)   336939353.18 *  -0.60%*
> Hmean     unixbench-spawn-1      	      5241.83 (   0.00%)        5246.00 (   0.08%)        4718.45 *  -9.98%*        4967.96 *  -5.22%*
> Hmean     unixbench-spawn-512    	     65799.86 (   0.00%)       64817.15 *  -1.49%*       66418.37 (   0.94%)       66820.63 *   1.55%*
> Hmean     unixbench-execl-1       	      3670.65 (   0.00%)        3622.36 *  -1.32%*        3661.04 (  -0.26%)        3660.08 (  -0.29%)
> Hmean     unixbench-execl-512    	     13682.00 (   0.00%)       13699.90 (   0.13%)       14103.91 (   3.08%)       12960.11 (  -5.28%)
> 
> o NPS4
> 
> kernel                        			tip                  SIS_NODE               SIS_NODE_LIMIT            SIS_NODE_TOPOEXT
> Hmean     unixbench-dhry2reg-1   	  41025577.99 (   0.00%)    40879469.78 (  -0.36%)    41082700.61 (   0.14%)    41260407.54 (   0.57%)
> Hmean     unixbench-dhry2reg-512	6255568261.91 (   0.00%)  6258326086.80 (   0.04%)  6252223940.32 (  -0.05%)  6259088809.43 (   0.06%)
> Amean     unixbench-syscall-1    	   2507165.37 (   0.00%)    2579108.77 *  -2.87%*      2488617.40 *   0.74%*     2517574.40 (  -0.42%)
> Amean     unixbench-syscall-512  	   7458476.50 (   0.00%)    7502528.67 *  -0.59%*      7978379.53 *  -6.97%*     7580369.27 *  -1.63%*
> Hmean     unixbench-pipe-1    		   2369301.21 (   0.00%)    2392905.29 *   1.00%*      2410432.93 *   1.74%*     2347814.20 (  -0.91%)
> Hmean     unixbench-pipe-512		 340299405.72 (   0.00%)  339139980.01 *  -0.34%*    340403992.95 (   0.03%)   338708678.82 *  -0.47%*
> Hmean     unixbench-spawn-1      	      5571.78 (   0.00%)       5423.03 (  -2.67%)         5462.82 (  -1.96%)        5543.08 (  -0.52%)
> Hmean     unixbench-spawn-512   	     63999.96 (   0.00%)      63485.41 (  -0.80%)        64730.98 *   1.14%*       67486.34 *   5.45%*
> Hmean     unixbench-execl-1       	      3587.15 (   0.00%)       3624.44 *   1.04%*         3638.74 *   1.44%*        3639.57 *   1.46%*
> Hmean     unixbench-execl-512    	     14184.17 (   0.00%)      13784.17 (  -2.82%)        13104.71 *  -7.61%*       13598.22 (  -4.13%)
> 
> ~~~~~~~~~~~~~~~~~~
> ~ DeathStarBench ~
> ~~~~~~~~~~~~~~~~~~
> 
> o NPS1
> CCD	Scaling	       tip    SIS_NODE 		SIS_NODE_LIMIT 		SIS_NODE_TOPOEXT
> 1	1		0%      0.30%  		 	0.83%  		 	0.79%
> 1	1		0%      0.17%  		 	2.53%  		 	0.91%
> 1	1		0%      -0.40% 		 	2.90%  		 	1.61%
> 1	1		0%      -7.95% 		 	1.19%  		 	-1.56%
> 
> o NPS2
> 
> CCD	Scaling	       tip    SIS_NODE 		SIS_NODE_LIMIT 		SIS_NODE_TOPOEXT
> 1	1		0%      0.34%  		 	-0.73% 		 	-0.62%
> 1	1		0%      -0.02% 		 	0.14%  		 	-1.15%
> 1	1		0%      -12.34%		 	-9.64% 		 	-7.80%
> 1	1		0%      -12.41%		 	-1.03% 		 	-9.85%
> 
> Note: In NPS2, 8 CCD case shows 10% run to run variation.
> 
> o NPS4
> 
> CCD	Scaling	       tip    SIS_NODE 		SIS_NODE_LIMIT 		SIS_NODE_TOPOEXT
> 1	1		0%      -1.32% 		 	-0.71% 		 	-1.09%
> 1	1		0%      -1.53% 		 	-1.11% 		 	-1.73%
> 1	1		0%      7.19%  		 	-3.47% 		 	5.75%
> 1	1		0%      -4.66% 		 	-1.91% 		 	-7.52%

LIMIT seems to do well for the NPS1 case, but how come it falls apart
for NPS2 ?!? that doesn't realy make sense, does it?

And again NPS4 is all over the place :/

> 
> --
> If you would like me to collect any more information during any of the
> above benchmark runs, please let me know.

dizzy with numbers ....

Perhaps see if you can figure out why NPS4 is so weird, there's only 2
CCXs to go around per node on that thing, the various results should not
be all over the map.

Perhaps pick hackbenc since it shows the problem and is easy and quick
to run?

Also, can you share the TOPOEXT code?



[Index of Archives]     [Linux Stable Commits]     [Linux Stable Kernel]     [Linux Kernel]     [Linux USB Devel]     [Linux Video &Media]     [Linux Audio Users]     [Yosemite News]     [Linux SCSI]

  Powered by Linux