[Bug 30712] Slow transitioning AMD ondemand CPU because of wrong sampling_rate

https://bugzilla.kernel.org/show_bug.cgi?id=30712





--- Comment #5 from justincase@xxxxxxxxxxx  2011-03-28 01:47:43 ---
(In reply to comment #4)
> Great, thanks!
> I know this runs for a while..., but could you let it run (over night?) with
> different sampling_rate values (this was default, 109ms?).
> Best: min, default and one or 2 in between (above was 

Yes sir!

Strangely, I got better results this time (for sampling_rate = 100900). In the
meantime I changed the kernel from 2.6.37.x to 2.6.38; I don't know if that's
the reason.

Also, I must say that the machine is not completely idle. For example, every
5 minutes, about 45 rrdtool PNGs are generated...

Then I had to modify the benchmark source code a little, because the kernel
does not keep the sampling_rate value when the governor is changed.

i.e. line 157 @ benchmark.c: 
 /* set the powersave governor which activates P-State switching
  * again */
 if (set_cpufreq_governor(config->governor, config->cpu) != 0)
     return;

 /* the governor change resets sampling_rate, so write it again */
 int slen = strlen(config->sampling_rate);
 if (sysfs_write_file(0, "ondemand/sampling_rate", config->sampling_rate,
                      slen) != slen)
     return;

I hope this modification is effective immediately, because if the kernel waits
for the previous sampling cycle to finish before using the new value, that
might be a problem... 
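
One way to check at least that the write was accepted is to read the value
back right after writing it. The sketch below is not part of cpufreq-bench:
write_value, read_value, apply_sampling_rate and the ondemand_dir parameter
are hypothetical names, and the sysfs directory is passed in by the caller
rather than hardcoded. A successful read-back only shows that the kernel
stored the value; it says nothing about when the governor starts sampling at
the new rate.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write `value` to the file at `path`. Returns 0 on success, -1 on error. */
int write_value(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    int failed = (fputs(value, f) == EOF);
    /* fclose() flushes the buffer, so a rejected write can surface here. */
    if (fclose(f) != 0)
        failed = 1;
    return failed ? -1 : 0;
}

/* Read the first line of `path` into `buf`, without the trailing newline. */
int read_value(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char *got = fgets(buf, (int)len, f);
    fclose(f);
    if (got == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/*
 * Write `rate` to <ondemand_dir>/sampling_rate and read it back.
 * `ondemand_dir` is an assumption: the caller supplies the governor's
 * sysfs directory. Returns 0 only if the value read back equals `rate`.
 */
int apply_sampling_rate(const char *ondemand_dir, const char *rate)
{
    char path[512];
    char current[64];

    assert(ondemand_dir != NULL && rate != NULL);

    int n = snprintf(path, sizeof path, "%s/sampling_rate", ondemand_dir);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;
    if (write_value(path, rate) != 0)
        return -1;
    if (read_value(path, current, sizeof current) != 0)
        return -1;
    return strcmp(current, rate) == 0 ? 0 : -1;
}
```

A non-zero return would also catch the case where the stored value differs
from the one requested, for example if the kernel adjusts it.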

But let's see the bench! 


sampling rate -> 10900

#round load sleep performance powersave percentage
0 50000 50000 50379 57822 87.128
1 100000 100000 96123 105471 91.136
2 150000 150000 157853 165049 95.640
3 200000 200000 195221 208529 93.618
4 250000 250000 240717 259260 92.848
5 300000 300000 295529 304502 97.053
6 350000 350000 351005 355713 98.676
7 400000 400000 404862 407786 99.283
8 450000 450000 455725 458553 99.383
9 500000 500000 506461 509685 99.367
10 550000 550000 552422 558971 98.828
11 600000 600000 599962 609144 98.493
12 650000 650000 653394 650651 100.422
13 700000 700000 742302 759985 97.673
14 750000 750000 743909 749063 99.312
15 800000 800000 839556 848876 98.902
16 850000 850000 836299 852711 98.075
17 900000 900000 878341 906728 96.869
18 950000 950000 958657 955908 100.288
19 1000000 1000000 996672 1008523 98.825
20 1050000 1050000 1115971 1128368 98.901
21 1100000 1100000 1066765 1096360 97.301
22 1150000 1150000 1208905 1248820 96.804
23 1200000 1200000 1201914 1207109 99.570
24 1250000 1250000 1252482 1291425 96.984
25 1300000 1300000 1311825 1310604 100.093
26 1350000 1350000 1356576 1346045 100.782
27 1400000 1400000 1396160 1414234 98.722
28 1450000 1450000 1460034 1456336 100.254
29 1500000 1500000 1473712 1505785 97.870
30 1550000 1550000 1541369 1544064 99.825
31 1600000 1600000 1595234 1563091 102.056
32 1650000 1650000 1737228 1768357 98.240
33 1700000 1700000 1688818 1674172 100.875
34 1750000 1750000 1728056 1730448 99.862
35 1800000 1800000 1749468 1788727 97.805
36 1850000 1850000 1830809 1821587 100.506
37 1900000 1900000 1968621 2001197 98.372
38 1950000 1950000 1956487 1959974 99.822
39 2000000 2000000 1989418 1991125 99.914


sampling rate -> 25000

#round load sleep performance powersave percentage
0 50000 50000 48866 63812 76.578
1 100000 100000 95406 103630 92.064
2 150000 150000 159273 170228 93.564
3 200000 200000 184444 209933 87.858
4 250000 250000 246598 254528 96.885
5 300000 300000 286551 291895 98.169
6 350000 350000 359588 378732 94.945
7 400000 400000 392532 402556 97.510
8 450000 450000 445735 449587 99.143
9 500000 500000 531572 535932 99.186
10 550000 550000 549339 597997 91.863
11 600000 600000 593279 604690 98.113
12 650000 650000 641693 635087 101.040
13 700000 700000 692703 673405 102.866
14 750000 750000 720045 719564 100.067
15 800000 800000 840078 798726 105.177
16 850000 850000 833332 922140 90.369
17 900000 900000 846550 858108 98.653
18 950000 950000 938212 1009257 92.961
19 1000000 1000000 965626 943793 102.313
20 1050000 1050000 992188 1043123 95.117
21 1100000 1100000 1079752 1053940 102.449
22 1150000 1150000 1112577 1066276 104.342
23 1200000 1200000 1178584 1186980 99.293
24 1250000 1250000 1219567 1209520 100.831
25 1300000 1300000 1275331 1295114 98.472
26 1350000 1350000 1290260 1271751 101.455
27 1400000 1400000 1317887 1355029 97.259
28 1450000 1450000 1392389 1417351 98.239
29 1500000 1500000 1450419 1470584 98.629
30 1550000 1550000 1494379 1527781 97.814
31 1600000 1600000 1578152 1574044 100.261
32 1650000 1650000 1564408 1613680 96.947
33 1700000 1700000 1641712 1658964 98.960
34 1750000 1750000 1704641 1703238 100.082
35 1800000 1800000 1854979 1908299 97.206
36 1850000 1850000 1930030 1965809 98.180
37 1900000 1900000 2006221 2002679 100.177
38 1950000 1950000 1991565 2066377 96.380
39 2000000 2000000 1937709 1963756 98.674


sampling rate -> 50000

#round load sleep performance powersave percentage
0 50000 50000 50461 57215 88.195
1 100000 100000 101266 108726 93.139
2 150000 150000 150922 161746 93.308
3 200000 200000 201366 220502 91.322
4 250000 250000 251854 270302 93.175
5 300000 300000 302107 324696 93.043
6 350000 350000 352833 377245 93.529
7 400000 400000 402723 430434 93.562
8 450000 450000 453385 475389 95.371
9 500000 500000 516310 523196 98.684
10 550000 550000 527734 541273 97.499
11 600000 600000 603932 623068 96.929
12 650000 650000 654171 672086 97.334
13 700000 700000 722544 721662 100.122
14 750000 750000 755333 780769 96.742
15 800000 800000 784639 771908 101.649
16 850000 850000 857070 866353 98.929
17 900000 900000 896301 921364 97.280
18 950000 950000 953981 977734 97.571
19 1000000 1000000 986557 945493 104.343
20 1050000 1050000 1066578 1082070 98.568
21 1100000 1100000 1118159 1137267 98.320
22 1150000 1150000 1171215 1181797 99.105
23 1200000 1200000 1248940 1218559 102.493
24 1250000 1250000 1274655 1293596 98.536
25 1300000 1300000 1354501 1327079 102.066
26 1350000 1350000 1350517 1379971 97.866
27 1400000 1400000 1403327 1425986 98.411
28 1450000 1450000 1490654 1501879 99.253
29 1500000 1500000 1516141 1532396 98.939
30 1550000 1550000 1603324 1576549 101.698
31 1600000 1600000 1624517 1720622 94.415
32 1650000 1650000 1700588 1712018 99.332
33 1700000 1700000 1709934 1732689 98.687
34 1750000 1750000 1765423 1822591 96.863
35 1800000 1800000 1814231 1839975 98.601
36 1850000 1850000 1888431 1920897 98.310
37 1900000 1900000 1918348 1942091 98.777
38 1950000 1950000 1990487 2016755 98.698
39 2000000 2000000 2029399 2055636 98.724


sampling rate -> 75000

#round load sleep performance powersave percentage
0 50000 50000 50667 84951 59.643
1 100000 100000 108964 131021 83.165
2 150000 150000 152263 171764 88.647
3 200000 200000 202738 231244 87.673
4 250000 250000 253296 281282 90.050
5 300000 300000 323773 349455 92.651
6 350000 350000 380587 409458 92.949
7 400000 400000 434598 459378 94.606
8 450000 450000 451185 462619 97.529
9 500000 500000 544721 565812 96.272
10 550000 550000 557344 584730 95.316
11 600000 600000 600266 629588 95.343
12 650000 650000 710363 735434 96.591
13 700000 700000 706661 740431 95.439
14 750000 750000 755650 772399 97.832
15 800000 800000 802048 842788 95.166
16 850000 850000 853532 882385 96.730
17 900000 900000 902382 923021 97.764
18 950000 950000 1025231 1058781 96.831
19 1000000 1000000 1084996 1114325 97.368
20 1050000 1050000 1054255 1085418 97.129
21 1100000 1100000 1147697 1168613 98.210
22 1150000 1150000 1237486 1255271 98.583
23 1200000 1200000 1189043 1246567 95.385
24 1250000 1250000 1248880 1282136 97.406
25 1300000 1300000 1303318 1323350 98.486
26 1350000 1350000 1360049 1375888 98.849
27 1400000 1400000 1394339 1424795 97.862
28 1450000 1450000 1423447 1499351 94.938
29 1500000 1500000 1626024 1664063 97.714
30 1550000 1550000 1568529 1585058 98.957
31 1600000 1600000 1613514 1651092 97.724
32 1650000 1650000 1665502 1684689 98.861
33 1700000 1700000 1745951 1789933 97.543
34 1750000 1750000 1755412 1809875 96.991
35 1800000 1800000 1818859 1853292 98.142
36 1850000 1850000 1962896 1996294 98.327
37 1900000 1900000 2015061 2053166 98.144
38 1950000 1950000 1958055 1991440 98.324
39 2000000 2000000 2059918 2077839 99.138


sampling rate -> 100900

#round load sleep performance powersave percentage
0 50000 50000 49871 84547 58.987
1 100000 100000 96912 116817 82.961
2 150000 150000 146427 170140 86.063
3 200000 200000 201665 222451 90.656
4 250000 250000 252238 279322 90.304
5 300000 300000 302541 329129 91.922
6 350000 350000 353094 385873 91.505
7 400000 400000 399234 437468 91.260
8 450000 450000 453888 491621 92.325
9 500000 500000 504262 539304 93.502
10 550000 550000 547418 593890 92.175
11 600000 600000 589834 628469 93.853
12 650000 650000 648893 698749 92.865
13 700000 700000 704968 756190 93.226
14 750000 750000 750982 792786 94.727
15 800000 800000 798755 842458 94.812
16 850000 850000 842453 872535 96.552
17 900000 900000 897170 934559 95.999
18 950000 950000 956476 980354 97.564
19 1000000 1000000 1001540 1026438 97.574
20 1050000 1050000 1037185 1068218 97.095
21 1100000 1100000 1108554 1134449 97.717
22 1150000 1150000 1157416 1185222 97.654
23 1200000 1200000 1181416 1208771 97.737
24 1250000 1250000 1255804 1281740 97.977
25 1300000 1300000 1307917 1329915 98.346
26 1350000 1350000 1313132 1365145 96.190
27 1400000 1400000 1409645 1422965 99.064
28 1450000 1450000 1460223 1456055 100.286
29 1500000 1500000 1510886 1522465 99.239
30 1550000 1550000 1554302 1566502 99.221
31 1600000 1600000 1577039 1618269 97.452
32 1650000 1650000 1644825 1684849 97.625
33 1700000 1700000 1668999 1729190 96.519
34 1750000 1750000 1747671 1793127 97.465
35 1800000 1800000 1785283 1828590 97.632
36 1850000 1850000 1863788 1885203 98.864
37 1900000 1900000 1887764 1940127 97.301
38 1950000 1950000 1955864 1980122 98.775
39 2000000 2000000 1971118 2033591 96.928


This last one is surprising because it is very different from the previous one.
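
For reading the tables: the percentage column appears to be the performance
time divided by the powersave time, times 100. The sketch below assumes that
reading of the columns and that both times are in the same unit;
round_percentage and mean_percentage are hypothetical helper names, not
functions from the benchmark. It reproduces the printed values to within
rounding, e.g. round 0 of the 10900 run (50379 / 57822).

```c
#include <assert.h>

/* Performance time as a share of powersave time, in percent. */
double round_percentage(long performance_time, long powersave_time)
{
    assert(powersave_time > 0);
    return 100.0 * (double)performance_time / (double)powersave_time;
}

/* Mean of the per-round percentages over `rounds` rows of a table. */
double mean_percentage(const long *performance_time,
                       const long *powersave_time, int rounds)
{
    double sum = 0.0;

    assert(rounds > 0);
    for (int i = 0; i < rounds; i++)
        sum += round_percentage(performance_time[i], powersave_time[i]);
    return sum / rounds;
}
```

Averaging only the first few rounds of each table is one way to compare the
sampling rates where they differ most, at short load times.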

Strange enough that I reran the test I did last time:
 for i in {000..999} ; do dd if=/dev/zero of=file$i bs=1M count=1 ; done

In short, I got these average results (total execution time):
 performance : 4.6s
 ondemand (sr = 109000) : 8.2s
 ondemand (sr = 10900) : 5.8s

Decreasing sampling_rate is still good for performance, but the total time is
much shorter in every case than last time. Maybe because of the new "RCU
pathname lookup" in 2.6.38? For information, this was done on software RAID5 +
ext4.
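
As a worked figure for the averages above, the extra time relative to the
performance governor is (t - baseline) / baseline. With those numbers,
ondemand takes roughly 78% longer at sr = 109000 and roughly 26% longer at
sr = 10900. The helper name extra_time_percent is hypothetical:

```c
#include <assert.h>

/* Extra run time relative to a baseline run, in percent. */
double extra_time_percent(double seconds, double baseline_seconds)
{
    assert(baseline_seconds > 0.0);
    return 100.0 * (seconds - baseline_seconds) / baseline_seconds;
}
```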

> I agree that it would make sense to hardcode latency values in powernow-k8 at
> least for some families. Even latency is wrong then, it should get set in a way
> that ondemand takes best sampling rate values later.

Kind of an auto-adaptive sampling rate? 

What I did for Debian is tweak the init.d script coming from the cpufrequtils
package like this: if the sampling_rate is > 100000, set it to
sampling_rate_min (see
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=614256). 
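
The rule itself is small enough to write as a pure function. This is a C
sketch of the rule only, not the shell code from the Debian bug report;
pick_sampling_rate and SAMPLING_RATE_LIMIT are hypothetical names, and 100000
is the threshold quoted above.

```c
#include <assert.h>

/* Threshold from the rule above; the name is an assumption. */
#define SAMPLING_RATE_LIMIT 100000L

/*
 * Return the sampling rate to use: the minimum if the current value is
 * above the limit, otherwise the current value unchanged.
 */
long pick_sampling_rate(long current, long minimum)
{
    assert(minimum > 0);
    return current > SAMPLING_RATE_LIMIT ? minimum : current;
}
```

With the values seen in this bug, a current rate of 109000 and a minimum of
10900 would give 10900, while a rate of 50000 would be left alone.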

Hope this helps! 

Fabien C.

