On 2020-06-24-10-14-38, Krzysztof Kozlowski wrote:
> On Wed, Jun 24, 2020 at 10:01:17AM +0200, Willy Wolff wrote:
> > Hi Krzysztof,
> > Thanks for looking at it.
> >
> > mem_gov is /sys/class/devfreq/10c20000.memory-controller/governor
> >
> > Here are some numbers after increasing the running time:
> >
> > Running using simple_ondemand:
> > Before:
> >      From  :   To
> >            : 165000000 206000000 275000000 413000000 543000000 633000000 728000000 825000000   time(ms)
> > * 165000000:         0         0         0         0         0         0         0         4   4528600
> >   206000000:         5         0         0         0         0         0         0         0     57780
> >   275000000:         0         5         0         0         0         0         0         0     50060
> >   413000000:         0         0         5         0         0         0         0         0     46240
> >   543000000:         0         0         0         5         0         0         0         0     48970
> >   633000000:         0         0         0         0         5         0         0         0     47330
> >   728000000:         0         0         0         0         0         0         0         0         0
> >   825000000:         0         0         0         0         0         5         0         0    331300
> > Total transition : 34
> >
> > After:
> >      From  :   To
> >            : 165000000 206000000 275000000 413000000 543000000 633000000 728000000 825000000   time(ms)
> > * 165000000:         0         0         0         0         0         0         0         4   5098890
> >   206000000:         5         0         0         0         0         0         0         0     57780
> >   275000000:         0         5         0         0         0         0         0         0     50060
> >   413000000:         0         0         5         0         0         0         0         0     46240
> >   543000000:         0         0         0         5         0         0         0         0     48970
> >   633000000:         0         0         0         0         5         0         0         0     47330
> >   728000000:         0         0         0         0         0         0         0         0         0
> >   825000000:         0         0         0         0         0         5         0         0    331300
> > Total transition : 34
> >
> > With a running time of:
> > LITTLE => 283.699 s (680.877 c per mem access)
> > big    => 284.47 s (975.327 c per mem access)
>
> I see there were no transitions during your memory test.
>
> >
> > And when I set the performance governor:
> > Before:
> >      From  :   To
> >            : 165000000 206000000 275000000 413000000 543000000 633000000 728000000 825000000   time(ms)
> >   165000000:         0         0         0         0         0         0         0         5   5099040
> >   206000000:         5         0         0         0         0         0         0         0     57780
> >   275000000:         0         5         0         0         0         0         0         0     50060
> >   413000000:         0         0         5         0         0         0         0         0     46240
> >   543000000:         0         0         0         5         0         0         0         0     48970
> >   633000000:         0         0         0         0         5         0         0         0     47330
> >   728000000:         0         0         0         0         0         0         0         0         0
> > * 825000000:         0         0         0         0         0         5         0         0    331350
> > Total transition : 35
> >
> > After:
> >      From  :   To
> >            : 165000000 206000000 275000000 413000000 543000000 633000000 728000000 825000000   time(ms)
> >   165000000:         0         0         0         0         0         0         0         5   5099040
> >   206000000:         5         0         0         0         0         0         0         0     57780
> >   275000000:         0         5         0         0         0         0         0         0     50060
> >   413000000:         0         0         5         0         0         0         0         0     46240
> >   543000000:         0         0         0         5         0         0         0         0     48970
> >   633000000:         0         0         0         0         5         0         0         0     47330
> >   728000000:         0         0         0         0         0         0         0         0         0
> > * 825000000:         0         0         0         0         0         5         0         0    472980
> > Total transition : 35
> >
> > With a running time of:
> > LITTLE: 68.8428 s (165.223 c per mem access)
> > big:    71.3268 s (244.549 c per mem access)
> >
> >
> > I see some transitions, but not occurring during the benchmark.
> > I haven't dived into the code yet, but maybe the heuristic behind it is not
> > well defined? If you know how it works, that would be helpful before I dive
> > into it.
>
> Sorry, don't know that much. It seems it counts time between overflow of
> DMC perf events and based on this bumps up the frequency.
>
> Maybe your test does not fit well in the current formula? Maybe the formula
> has some drawbacks...

OK, I will read the code then.

> >
> > I ran your test as well, and indeed it seems to work for a large bunch of memory,
> > and there is some delay before making a transition (seems to be around 10s).
> > When you kill memtester, it reduces the frequency stepwise every ~10s.
> >
> > Note that the timings shown above account for the critical path, and the code is
> > looping on reading only; there is no write in the critical path.
>
> Maybe memtester is doing writes and the devfreq heuristic uses only write info?
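
As a rough mental model before reading the code: below is a minimal, simplified
sketch of how a simple_ondemand-style devfreq governor could turn the busy/total
counts reported by the DMC driver into a frequency request. The 90/5 thresholds
and the scaling step are assumptions about the default behaviour, not the exact
kernel code.

#include <stdio.h>

/*
 * Simplified sketch of a simple_ondemand-style decision, not the exact
 * kernel code: the driver reports how busy the DMC was over the last
 * polling period and the governor maps that ratio to a frequency request.
 * The 90/5 thresholds are assumed defaults.
 */
#define UPTHRESHOLD		90	/* % busy above which we jump to max */
#define DOWNDIFFERENTIAL	 5	/* hysteresis before scaling back down */

static unsigned long long pick_freq(unsigned long long busy,
				    unsigned long long total,
				    unsigned long long cur,
				    unsigned long long max)
{
	if (total == 0)
		return max;	/* no data yet: play it safe */
	if (busy * 100 > total * UPTHRESHOLD)
		return max;	/* busy enough: request max frequency */
	if (busy * 100 > total * (UPTHRESHOLD - DOWNDIFFERENTIAL))
		return cur;	/* inside the hysteresis band: keep current */
	/* otherwise scale down in proportion to the measured load */
	return busy * 100 * cur / (total * (UPTHRESHOLD - DOWNDIFFERENTIAL));
}

int main(void)
{
	/* e.g. 40% busy at 825 MHz asks for a much lower frequency */
	printf("%llu\n", pick_freq(40, 100, 825000000ULL, 825000000ULL));
	return 0;
}

If the busy count fed into such a formula only reflected write traffic, a
read-only loop would look idle to the governor, which is what the question
above is getting at.
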
> You mentioned that you want to cut the prefetcher to have direct access
> to RAM. But the prefetcher also accesses the RAM; it does not get the
> contents out of thin air. Although this is unrelated to the problem,
> because your pattern should kick ondemand as well.

Yes, obviously. I was just describing the microbenchmark and its memory access
pattern a bit. I was suggesting that a random pattern will break the
effectiveness of the prefetcher, and as such we have a worst-case situation on
the memory bus.

> Best regards,
> Krzysztof
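
For concreteness, here is a minimal sketch of the kind of read-only,
random-pattern walk described above (an illustration, not the actual
benchmark): a dependent pointer chase over a shuffled permutation, so the
prefetcher cannot predict the next line and the timed loop contains no stores.

#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
	/* ~64 MiB of 8-byte slots: large enough to miss in the caches */
	size_t n = (64UL << 20) / sizeof(size_t);
	size_t *buf = malloc(n * sizeof(size_t));
	size_t i, j, tmp, idx = 0;
	struct timespec t0, t1;

	if (!buf)
		return 1;

	/* Build a random single-cycle permutation (Sattolo's algorithm),
	 * so the timed loop is one long dependent chain of loads. */
	for (i = 0; i < n; i++)
		buf[i] = i;
	srand(42);
	for (i = n - 1; i > 0; i--) {
		j = (size_t)rand() % i;
		tmp = buf[i];
		buf[i] = buf[j];
		buf[j] = tmp;
	}

	/* Critical path: reads only, each load depends on the previous one,
	 * and the random order defeats the stride prefetcher. */
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < n; i++)
		idx = buf[idx];
	clock_gettime(CLOCK_MONOTONIC, &t1);

	printf("idx=%zu, %.3f s\n", idx,
	       (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
	free(buf);
	return 0;
}

While such a loop runs, watching the trans_stat file next to the governor
entry under /sys/class/devfreq/10c20000.memory-controller/ shows whether the
governor reacts to the load.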