On Wed, 27 May 2020, Naveen Krishna Ch wrote:

> These registers are 32bit counters, they might wrap-around quite faster at
> high work loads. So, we used a kernel thread to accumulate the values of
> each core and socket to 64bit values.
>
> Depending on when the module is inserted in the system, the initial values
> of the counters could be different and we do not have a way to know, how
> many time the registers are wrapped around in the past.

I understand that. If you anticipate that the module may be inserted after a
wraparound, the driver should populate 'prev_value' with actual counter values
instead of zeros. That way the driver will properly accumulate energy over the
time it has been inserted. As implemented, the driver counts energy since boot
time, minus an unknown amount lost to wraparounds if the driver was loaded too
late.

In my case I observed the contradictory readings over a period of several
seconds where no wraparound was possible.

> In our evaluation, the sum of the energy consumption of cores of a socket was
> always less (actually far lesser) than the socket energy consumption.

Did you try on laptop CPUs (Renoir SoC, Ryzen 4x00U marketing name)?

You also might need specific workloads to observe the issue. I first found it
with a small hand-written test, and then found a bigger discrepancy with the
AVX test from https://github.com/travisdowns/avx-turbo

Alexander
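
To illustrate the 'prev_value' suggestion above, here is a minimal userspace sketch, not the driver's actual code. The struct and function names are hypothetical; only 'prev_value' comes from the discussion. It assumes the counter wraps at most once between two consecutive reads.

```c
#include <stdint.h>

/* Hypothetical per-counter state; only 'prev_value' is a name from the thread. */
struct energy_accum {
	uint32_t prev_value;	/* last raw 32-bit counter reading */
	uint64_t total;		/* energy accumulated since init, in raw units */
};

/*
 * Seed prev_value with the current raw reading instead of zero, so that
 * energy consumed before the module was loaded is not counted.
 */
static void energy_accum_init(struct energy_accum *a, uint32_t raw_now)
{
	a->prev_value = raw_now;
	a->total = 0;
}

/*
 * Fold a new raw reading into the 64-bit total. The subtraction is done
 * modulo 2^32, so a single wraparound between reads is handled correctly.
 */
static void energy_accum_update(struct energy_accum *a, uint32_t raw_now)
{
	uint32_t delta = raw_now - a->prev_value;

	a->total += delta;
	a->prev_value = raw_now;
}
```

With this seeding, the accumulated total starts at zero at load time and reflects only energy consumed while the module is present, regardless of how many times the register wrapped before that.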