Hi Eduardo,

The merge window for 3.14 is now open and I'm wondering if you had a
chance to look at these numbers?

Thanks,
Zoran

On 30 December 2013 12:48, Zoran Markovic <zoran.markovic@xxxxxxxxxx> wrote:
> Eduardo,
>
>>> What is the workload you're running besides the proprietary heater code?
> I re-did the experiments from the Linaro site pointed to by Amit while
> profiling _cpu_down() and _cpu_up() times:
>>> [1] https://wiki.linaro.org/WorkingGroups/PowerManagement/Archives/Hotplug
>
> I am attaching a spreadsheet with some results and graphs:
>
> Sheet 1 (thermal_ramp) has three plots. The topmost is an unbound
> thermal ramp that levels off at ~48C. The middle plot is a thermal ramp
> with CPU hotplug kicking in as a cooling device at 38C. The bottom plot
> is a thermal ramp with CPU hotplug kicking in at 38C and cpufreq
> kicking in at 40C. One interesting thing to note is that the middle
> plot slowly drifts towards 40C even though cooling is set to 38C. I
> attribute this to the logic of the step-wise governor combined with
> polling mode: if the temperature is dropping while above the trip
> point, cooling is reduced. Adding another cooling device at 40C as a
> back-stop seems to keep the temperature in check. In all cases the
> running code was ARM's max_power test, which maximizes CPU usage, as
> evidenced by the output of 'top':
>  PID USER  PR NI VIRT RES SHR S  %CPU %MEM    TIME+ COMMAND
>   33 root  20  0    0   0   0 R 100.0  0.0 45:46.43 thread1
>   32 root  20  0    0   0   0 R  91.4  0.0 44:48.14 thread0
> 1344 root  20  0    0   0   0 R   8.6  0.0  0:03.64 kworker/u4:1
> 1380 root  20  0 2476 996 712 R   0.3  0.1  0:00.07 top
>
> Sheet 2 (idle) has two plots. The top one shows the latency of
> _cpu_down() while gradually adding instances of the cyclictest process,
> from 0 to 10; 20 samples were captured in each case. The bottom one
> shows the latency of _cpu_up() in the same test. Other than running
> cyclictest, the system was mostly idle.
>
> Sheet 3 (max_power) repeated the same test as in sheet 2, but with
> ARM's max_power test running in the background.
>
> A quick look at the latency graphs shows that loading the system adds a
> stochastic (but not deterministic) component to the latencies. Minimum
> latency times appear unchanged.
>
>> - Homogeneous dual core Cortex-A9 environment.
>> - They go up to 48C when fully loaded. Can you explain where your
>> sensor is located? Gradient to hotspot, etc.? 48C at the A9s or board
>> temperature?
> The thermal sensor is located at the L2 cache, with the gradient to the
> sensor likely smaller than the sensor's inaccuracy.
>
>> - This code looks promising on an embedded dual-core system. However,
>> it does not necessarily mean it works fine on, say, the server side.
>> How about a system with 8/16/32 cores? How about a more heterogeneous
>> workload? Not to mention heterogeneous cores. I think in more
>> complicated scenarios the data you provided above might even change.
>> The difference between your minimum and maximum shutdown/startup times
>> is quite considerable, so I am assuming your variance is not
>> negligible; imagine what happens if we scale this up.
> Agreed that this is difficult to characterize across all platform
> types. Maybe other list members could comment on the behaviour of
> their platforms? Passing in a CPU mask defines which CPUs contribute
> to cooling a single zone, so there is some flexibility in defining the
> cooling strategy. Hopefully this is good enough for a start...
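[Editorial illustration of the cpumask-based hook-up discussed above: the
sketch below shows how a platform driver might bind both cooling devices
to the two trip points used in the test (38C for hotplug, 40C for
cpufreq). cpuhotplug_cooling_register() and the TRIP_* indices are
hypothetical placeholders standing in for whatever this patch set
finally exposes; cpufreq_cooling_register() and
thermal_zone_bind_cooling_device() are existing interfaces as of 3.13.]

/*
 * Sketch only: wire up CPU hotplug as the 38C cooling device and
 * cpufreq as the 40C back-stop for an already-registered thermal zone.
 * cpuhotplug_cooling_register() and the TRIP_* indices are hypothetical;
 * the remaining calls are existing 3.13-era interfaces.
 */
#include <linux/cpumask.h>
#include <linux/cpu_cooling.h>
#include <linux/thermal.h>
#include <linux/err.h>

#define TRIP_HOTPLUG	0	/* hypothetical index of the 38C trip */
#define TRIP_CPUFREQ	1	/* hypothetical index of the 40C trip */

static int example_bind_cooling(struct thermal_zone_device *tz)
{
	struct thermal_cooling_device *hp_cdev, *freq_cdev;
	int ret;

	/* All online CPUs may be unplugged or frequency-capped for this zone. */
	hp_cdev = cpuhotplug_cooling_register(cpu_online_mask);	/* hypothetical */
	if (IS_ERR(hp_cdev))
		return PTR_ERR(hp_cdev);

	freq_cdev = cpufreq_cooling_register(cpu_online_mask);
	if (IS_ERR(freq_cdev))
		return PTR_ERR(freq_cdev);

	/* Hotplug handles the first (38C) trip... */
	ret = thermal_zone_bind_cooling_device(tz, TRIP_HOTPLUG, hp_cdev,
					       THERMAL_NO_LIMIT, THERMAL_NO_LIMIT);
	if (ret)
		return ret;

	/* ...and cpufreq acts as the 40C back-stop. */
	return thermal_zone_bind_cooling_device(tz, TRIP_CPUFREQ, freq_cdev,
						THERMAL_NO_LIMIT, THERMAL_NO_LIMIT);
}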
>
>> - The other point is that this type of cooling device must be applied
>> in a very sensible way. Shutting down circuitry may not be the best
>> strategy for thermal management. In fact, if you think about it, given
>> a workload well balanced between, say, two cores, as in your
>> environment, turning one off means you need to handle the very same
>> load on only one CPU. In other words, turning off circuitry means,
>> from a thermal standpoint, that you are increasing your heat/area
>> ratio. Sometimes you actually want to increase this ratio in order to
>> properly cool down your system.
> In this particular test case, since both CPUs are fully loaded,
> temperature is reduced at the expense of parallelism (i.e. execution
> time), so the overall heat/area is still reduced. If particular areas
> are heat-sensitive, then it makes sense to define a separate thermal
> zone (and sensor) for each of them. Just a thought.
>
> Looking forward to further discussion.
>
> Regards,
> Zoran
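[Editorial note: for anyone wanting to sanity-check the attached latency
numbers on their own platform, timings of this kind can be collected with
a trivial test module along the lines of the sketch below. It wraps the
exported cpu_down()/cpu_up() helpers, which also take the hotplug locks,
so it measures a slight superset of the bare _cpu_down()/_cpu_up() paths
profiled in the spreadsheet; it is an illustration, not the harness from
the Linaro wiki page cited above.]

/*
 * Sketch of a minimal test module that times one offline/online cycle of
 * CPU1. cpu_down()/cpu_up() acquire the hotplug locks on top of the bare
 * _cpu_down()/_cpu_up() paths, so these numbers are an upper bound on
 * the latencies discussed above.
 */
#include <linux/module.h>
#include <linux/cpu.h>
#include <linux/ktime.h>
#include <linux/hrtimer.h>

static int __init hotplug_latency_init(void)
{
	ktime_t t0;
	s64 down_us, up_us;
	int ret;

	t0 = ktime_get();
	ret = cpu_down(1);		/* take CPU1 offline */
	down_us = ktime_to_us(ktime_sub(ktime_get(), t0));
	if (ret)
		return ret;

	t0 = ktime_get();
	ret = cpu_up(1);		/* bring CPU1 back online */
	up_us = ktime_to_us(ktime_sub(ktime_get(), t0));
	if (ret)
		return ret;

	pr_info("cpu_down: %lld us, cpu_up: %lld us\n", down_us, up_us);
	return 0;
}
module_init(hotplug_latency_init);

MODULE_LICENSE("GPL");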