Re: [PATCH v4 1/3] hwmon: Add amd_energy driver to report energy counters

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 4/27/20 5:39 AM, Naveen Krishna Ch wrote:
> Hi Guenter,
> 
> On Sat, 25 Apr 2020 at 21:47, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
>>
>> On Fri, Apr 24, 2020 at 08:50:54AM +0530, Naveen Krishna Chatradhi wrote:
>>> This patch adds hwmon based amd_energy driver support for
>>> family 17h processors from AMD.
>>>
>>> The driver provides following interface to the userspace
>>> 1. Reports the per core consumption
>>>       * file: "energy%d_input", label: "Ecore%03d"
>>> 2. Reports per socket energy consumption
>>>       * file: "energy%d_input", label: "Esocket%d"
>>> 2. Reports scaled energy value in Joules.
>>>
>>> Cc: Guenter Roeck <linux@xxxxxxxxxxxx>
>>> Signed-off-by: Naveen Krishna Chatradhi <nchatrad@xxxxxxx>
>>
>> Couple of additional comments below.
>>
>> On a higher level, I noticed that the data overflows quickly
>> (ie within a day or so), which is the reason why the matching
>> code for Intel CPUs never made it into the kernel. With that
>> in mind, I do wonder if this is really valuable. I am quite
>> concerned about people complaining that the data is useless,
>> and I have to say that I find it quite useless myself. Any
>> system running for more than a few hours will report more or
>> less random data. Any thoughts on that ?
> This driver will also address the future platforms with
> support for 64-bit counters.
> 
> We do have platforms that will come out very shortly, which will
> support a different energy resolution to increase the wraparound
> time with 32bit counters,
> 
> On a platform with 2 sockets each with 64 cores,
> Assuming 240W, new resolution of 15.6 milli Joules
> 
> -  Wrap around time for socket energy counter will be
> (up to ~3 to 4 days with maximum load).
> 
> 2^32*15.625e-3/240 = 279620.2667 secs to wrap (i.e 3.2 days)
> 
> - Wrap around time for the per-core energy counters with the
> new resolution will be
> 
> 2^32*15.6e-3/ (240 * 128) = 2184533.33 secs to wrap (i.e 25 days)
> 
> When a work load is to be run on certain predefined cores.
> The Work load managers can gather the energy status before starting
> and after completion of the job and measure power consumed by the
> work load.
> 
All that doesn't help much for existing platforms, nor for users who
expect that counters don't wrap around at all (at least not until they
reach the 64-bit limit).

I see two options. Either provide power reporting instead (which should
be done in the k10temp driver), or implement a kernel thread which runs
often enough to catch wraparounds. While not perfect (it will only report
the energy since the driver was loaded), it will at least avoid the
frequent wraparounds seen today, and that caveat can be documented.

>>
>> How about making the power reporting registers available and
>> reporting per-CPU power consumption instead ? I think that would
>> add much more value.
> We will expose power reporting when platform exposes that information.
> Until then, energy reporting becomes further critical.
> 

Some Windows tools such as HwInfo report power readings today,
on existing CPUs, so I don't really buy the claim that existing
chips don't report it.

Thanks,
Guenter



[Index of Archives]     [LM Sensors]     [Linux Sound]     [ALSA Users]     [ALSA Devel]     [Linux Audio Users]     [Linux Media]     [Kernel]     [Gimp]     [Yosemite News]     [Linux Media]

  Powered by Linux