On Wed, Jun 24, 2020 at 10:01:17AM +0200, Willy Wolff wrote:
> Hi Krzysztof,
> Thanks to look at it.
>
> mem_gov is /sys/class/devfreq/10c20000.memory-controller/governor
>
> Here some numbers after increasing the running time:
>
> Running using simple_ondemand:
> Before:
>      From  :   To
>            : 165000000 206000000 275000000 413000000 543000000 633000000 728000000 825000000  time(ms)
> * 165000000:         0         0         0         0         0         0         0         4   4528600
>   206000000:         5         0         0         0         0         0         0         0     57780
>   275000000:         0         5         0         0         0         0         0         0     50060
>   413000000:         0         0         5         0         0         0         0         0     46240
>   543000000:         0         0         0         5         0         0         0         0     48970
>   633000000:         0         0         0         0         5         0         0         0     47330
>   728000000:         0         0         0         0         0         0         0         0         0
>   825000000:         0         0         0         0         0         5         0         0    331300
> Total transition : 34
>
>
> After:
>      From  :   To
>            : 165000000 206000000 275000000 413000000 543000000 633000000 728000000 825000000  time(ms)
> * 165000000:         0         0         0         0         0         0         0         4   5098890
>   206000000:         5         0         0         0         0         0         0         0     57780
>   275000000:         0         5         0         0         0         0         0         0     50060
>   413000000:         0         0         5         0         0         0         0         0     46240
>   543000000:         0         0         0         5         0         0         0         0     48970
>   633000000:         0         0         0         0         5         0         0         0     47330
>   728000000:         0         0         0         0         0         0         0         0         0
>   825000000:         0         0         0         0         0         5         0         0    331300
> Total transition : 34
>
> With a running time of:
> LITTLE => 283.699 s (680.877 c per mem access)
> big => 284.47 s (975.327 c per mem access)

I see there were no transitions during your memory test.
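
If it helps for the next run, the before/after comparison can be automated.
Below is a minimal Python sketch, assuming the tables come from the
trans_stat file that sits next to the governor file you mentioned, and that
its layout matches what you pasted. The names parse_trans_stat and
diff_trans_stat are made up for illustration, and the sample data is a
two-row excerpt of your simple_ondemand numbers.

```python
import re

# One table row: optional "*" marker, source frequency, ":", then counters.
ROW = re.compile(r"^\s*\*?\s*(\d+)\s*:\s*([\d\s]+)$")
TOTAL = re.compile(r"Total transition\s*:\s*(\d+)")


def parse_trans_stat(text):
    """Return ({freq_hz: time_ms}, total_transitions) from trans_stat text."""
    residency, total = {}, None
    for line in text.splitlines():
        m = TOTAL.search(line)
        if m:
            total = int(m.group(1))
            continue
        m = ROW.match(line)  # header lines start with ":" and do not match
        if m:
            fields = m.group(2).split()
            if fields:
                # The last column of each row is the time spent at that frequency.
                residency[int(m.group(1))] = int(fields[-1])
    return residency, total


def diff_trans_stat(before, after):
    """Return (new transitions, {freq_hz: extra time_ms}) between two snapshots."""
    res_b, tot_b = parse_trans_stat(before)
    res_a, tot_a = parse_trans_stat(after)
    grew = {f: res_a[f] - t for f, t in res_b.items() if res_a.get(f, t) != t}
    return (tot_a or 0) - (tot_b or 0), grew


# Two-row excerpt of the simple_ondemand tables quoted above.
before = """\
*  165000000: 0 0 0 0 0 0 0 4 4528600
   825000000: 0 0 0 0 0 5 0 0 331300
Total transition : 34
"""
after = """\
*  165000000: 0 0 0 0 0 0 0 4 5098890
   825000000: 0 0 0 0 0 5 0 0 331300
Total transition : 34
"""
print(diff_trans_stat(before, after))  # (0, {165000000: 570290})
```

Zero new transitions, with all of the extra time accounted to 165000000,
is what the tables above show for the simple_ondemand run.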

>
> And when I set to the performance governor:
> Before:
>      From  :   To
>            : 165000000 206000000 275000000 413000000 543000000 633000000 728000000 825000000  time(ms)
>   165000000:         0         0         0         0         0         0         0         5   5099040
>   206000000:         5         0         0         0         0         0         0         0     57780
>   275000000:         0         5         0         0         0         0         0         0     50060
>   413000000:         0         0         5         0         0         0         0         0     46240
>   543000000:         0         0         0         5         0         0         0         0     48970
>   633000000:         0         0         0         0         5         0         0         0     47330
>   728000000:         0         0         0         0         0         0         0         0         0
> * 825000000:         0         0         0         0         0         5         0         0    331350
> Total transition : 35
>
> After:
>      From  :   To
>            : 165000000 206000000 275000000 413000000 543000000 633000000 728000000 825000000  time(ms)
>   165000000:         0         0         0         0         0         0         0         5   5099040
>   206000000:         5         0         0         0         0         0         0         0     57780
>   275000000:         0         5         0         0         0         0         0         0     50060
>   413000000:         0         0         5         0         0         0         0         0     46240
>   543000000:         0         0         0         5         0         0         0         0     48970
>   633000000:         0         0         0         0         5         0         0         0     47330
>   728000000:         0         0         0         0         0         0         0         0         0
> * 825000000:         0         0         0         0         0         5         0         0    472980
> Total transition : 35
>
> With a running time of:
> LITTLE: 68.8428 s (165.223 c per mem access)
> big: 71.3268 s (244.549 c per mem access)
>
>
> I see some transition, but not occurring during the benchmark.
> I haven't dive into the code, but maybe it is the heuristic behind that is not
> well defined? If you know how it's working that would be helpful before I dive
> in it.

Sorry, I don't know that much about it. It seems the driver counts the time
between overflows of the DMC perf events and bumps up the frequency based on
that. Maybe your test does not fit the current formula well? Maybe the
formula has some drawbacks...

>
> I run your test as well, and indeed, it seems to work for large bunch of memory,
> and there is some delay before making a transition (seems to be around 10s).
> When you kill memtester, it reduces the freq stepwisely every ~10s.
>
> Note that the timing shown above account for the critical path, and the code is
> looping on reading only, there is no write in the critical path.
> Maybe memtester is doing writes and devfreq heuristic uses only write info?
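
On the read/write question: as far as I recall, the generic simple_ondemand
governor never sees reads and writes separately. It only gets a
busy_time/total_time pair from the driver and compares that ratio against two
thresholds, so whether reads are counted is up to the driver's counters. A
rough Python sketch of that decision follows. It assumes the default
thresholds (upthreshold = 90, downdifferential = 5) from
drivers/devfreq/governor_simpleondemand.c, which may differ between kernel
versions, and it says nothing about how the DMC driver fills in busy_time.
The function name and arguments are made up for illustration.

```python
def simple_ondemand_target(busy_time, total_time, current_freq, max_freq,
                           upthreshold=90, downdifferential=5):
    """Return the raw target frequency in Hz for one governor decision.

    Assumption: mirrors the generic governor's logic as I remember it; the
    devfreq core would still round the result to a supported frequency.
    """
    # No measurement window yet, or unknown current frequency: go to maximum.
    if total_time == 0 or current_freq == 0:
        return max_freq
    # Busy above the upper threshold: jump straight to maximum.
    if busy_time * 100 > total_time * upthreshold:
        return max_freq
    # Busy inside the hysteresis band: keep the current frequency.
    if busy_time * 100 > total_time * (upthreshold - downdifferential):
        return current_freq
    # Otherwise scale the frequency down in proportion to the load.
    scaled = busy_time * current_freq // total_time
    return scaled * 100 // (upthreshold - downdifferential // 2)


print(simple_ondemand_target(95, 100, 165000000, 825000000))  # 825000000
print(simple_ondemand_target(87, 100, 413000000, 825000000))  # 413000000
print(simple_ondemand_target(10, 100, 825000000, 825000000))  # 93750000
```

If the busy ratio reported for your read-only loop never exceeds the upper
threshold, a governor of this shape would stay at the lowest frequency, which
would match the tables above.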

You mentioned that you want to cut out the prefetcher to have direct access
to RAM. But the prefetcher also accesses the RAM; it does not get the
contents out of thin air. That said, this is unrelated to the problem,
because your access pattern should kick ondemand as well.

Best regards,
Krzysztof