On Wed, 2012-10-24 at 01:40 +0100, Thomas Renninger wrote:
> > More and more people are getting interested in the subject of power
> > (energy) consumption monitoring. We have some external tools like
> > "battery simulators", energy probes etc., but some targets can measure
> > their power usage on their own.
> >
> > Traditionally such data should be exposed to the user via hwmon sysfs
> > interface, and that's exactly what I did for "my" platform - I have
> > a /sys/class/hwmon/hwmon*/device/energy*_input and this was good
> > enough to draw pretty graphs in userspace. Everyone was happy...
> >
> > Now I am getting new requests to do more with this data. In particular
> > I'm asked how to add such information to ftrace/perf output.
> Why? What is the gain?
>
> Perf events can be triggered at any point in the kernel.
> A cpufreq event is triggered when the frequency gets changed.
> CPU idle events are triggered when the kernel requests to enter an idle state
> or exits one.
>
> When would you trigger a thermal or a power event?
> There is the possibility of (critical) thermal limits.
> But if I understand this correctly you want this for debugging and
> I guess you have everything interesting one can do with temperature
> values:
> - read the temperature
> - draw some nice graphs from the results
>
> Hm, I guess I know what you want to do:
> In your temperature/energy graph, you want to have some dots
> when relevant HW states (frequency, sleep states, DDR power,...)
> changed. Then you are able to see the effects over a timeline.
>
> So you have to bring the existing frequency/idle perf events together
> with temperature readings
>
> Cleanest solution could be to enhance the existing userspace apps
> (pytimechart/perf timechart) and let them add another line
> (temperature/energy), but the data would not come from perf, but
> from sysfs/hwmon.
> Not sure whether this works out with the timechart tools.
> Anyway, this sounds like a userspace only problem.

Ok, so this is actually what I'm working on right now. Not with the
standard perf tool (there are other users of that API ;-) but I am indeed
trying to "enrich" the data stream coming from the kernel with values
originating in user space. I am a little concerned about the effect of
the extra syscalls (reading the value and calling gettimeofday to generate
a timestamp) at higher sampling rates, but most likely it won't be a
problem. I can report back once I know more, if this is of interest to
anyone.

Anyway, there are at least two debug/trace related use cases that cannot
be satisfied that way (of course one could argue about their usefulness):

1. ftrace-over-network (https://lwn.net/Articles/410200/), which is
   particularly appealing for "embedded users", where there's virtually
   no useful userspace available (think Android). Here a (functional)
   trace event is embedded into a normal trace and available "for free"
   on the host side.

2. perf groups - the general idea is that one event (be it a cycle
   counter interrupt or even a timer) triggers a read of other values
   (e.g. a cache counter or - in this case - an energy counter). The aim
   is to have regular "snapshots" of the system state. I'm not sure if
   the standard perf tool can do this, but I do :-)

And last, but not least, there are the non-debug/trace clients for energy
data, as discussed in other mails in this thread. Of course the trace
event won't really satisfy their needs either.

Thanks for your feedback!

Paweł

_______________________________________________
lm-sensors mailing list
lm-sensors@xxxxxxxxxxxxxx
http://lists.lm-sensors.org/mailman/listinfo/lm-sensors
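
For illustration only, the user-space sampling discussed in this mail (read the hwmon energy value, attach a timestamp) might look roughly like the sketch below. It is not the tool described above. The function names are hypothetical, and it assumes the energy*_input attribute holds a cumulative value in microjoules, as the hwmon sysfs documentation specifies. It uses Python's monotonic clock where the mail mentions gettimeofday; which clock actually lines up with a given trace is an assumption that would have to be checked.

```python
import time
from pathlib import Path


def read_energy_uj(path):
    """Read one cumulative energy value (assumed to be in microjoules)."""
    return int(Path(path).read_text().strip())


def sample_energy(path, count, interval_s, clock=time.monotonic):
    """Collect `count` (timestamp, energy_uj) pairs, `interval_s` apart.

    Hypothetical helper: each sample costs one clock call and one file
    read, which is the per-sample overhead in question at high rates.
    """
    samples = []
    for i in range(count):
        # Take the timestamp immediately before reading the counter.
        samples.append((clock(), read_energy_uj(path)))
        if i + 1 < count:
            time.sleep(interval_s)
    return samples


def average_power_w(samples):
    """Average power in watts between the first and the last sample."""
    (t0, e0), (t1, e1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples with increasing timestamps")
    # Microjoules to joules, then divide by elapsed seconds.
    return (e1 - e0) / 1e6 / (t1 - t0)
```

A call such as sample_energy("/sys/class/hwmon/hwmon0/device/energy1_input", count=10, interval_s=0.1) would return ten timestamped readings that could then be merged with trace data by timestamp. The hwmon index and channel number in that path are placeholders and vary per platform.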