Re: [PATCH 1/3 v7] hwmon: Add amd_energy driver to report energy counters

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 5/27/20 8:12 AM, Alexander Monakov wrote:
> On Wed, 27 May 2020, Guenter Roeck wrote:
> 
>>> I'm not exactly complaining, I'm proposing a solution: at probe time, populate
>>> prev_value members with MSR values instead of zeros. That way, the module will
>>> correctly count energy over the time it's been loaded. It can be unloaded and
>>> reloaded freely, and doing so would allow to easily measure energy across large
>>> spans of time, which sounds like an improvement.
>>>
>> That would ignore energy accumulated from before the driver was loaded, and
>> would thus trigger another set of complaints.
> 
> That doesn't sound right. There's no way for the driver to be sure that the
> counters did not wrap around before it was loaded. Here's a few scenarios
> how such wraparound is possible:
> 
> - while the user was messing in the bootloader for a few minutes
> - if the user kexec'd a new kernel
> - if the counters were not reset during a warm reboot
> 
> Ignoring initial values of the counters is generally the right thing to do.
> In the specific circumstances when the user wants to measure energy used
> since machine power-up, and they know the boot happened so quick the counters
> did not wrap around, they can easily script that with e.g. the rdmsr tool.
> Or perhaps the driver could pr_info the initial values at probe time.
> 
> Have such complaints already appeared in practice?
>
The main argument in the coretemp case, if I recall correctly, was that
the driver _doesn't_ provide valid data from the beginning of time, and
would thus be worthless. So, yes, such complaints have already appeared
in practice.

> Also note that documentation doesn't promise that counters start from zero
> at power-up time, although that's of course a natural assumption.
> 
> 
>> A slight improvement might be to add up core energy counters when loading
>> the driver, compare it against the package counter, and pick the larger
>> value for the initial package counter(s). This would at least ensure that
>> the package counter is never less than the sum of the core counters.
> 
> No, fudging the initial reading like this wouldn't help, because I was
> pointing out how core counters increment quicker than the package counter;
> i.e. even if the kernel fudged the initial values, they would still grow
> contradictory quick enough (on some workloads).
> 

This exchange is exactly what I was concerned about when this driver
was first submitted. I should have known better, and I should not
have accepted it. Right now I seriously wonder if I should revert/drop
it. Any arguments/thoughts why I _shouldn't_ do that ?

Guenter



[Index of Archives]     [LM Sensors]     [Linux Sound]     [ALSA Users]     [ALSA Devel]     [Linux Audio Users]     [Linux Media]     [Kernel]     [Gimp]     [Yosemite News]     [Linux Media]

  Powered by Linux