Hi Alexander

On Wed, 27 May 2020 at 12:29, Alexander Monakov <amonakov@xxxxxxxxx> wrote:
>
> On Wed, 27 May 2020, Naveen Krishna Ch wrote:
>
> > These registers are 32bit counters, they might wrap-around quite faster at
> > high work loads. So, we used a kernel thread to accumulate the values of
> > each core and socket to 64bit values.
> >
> > Depending on when the module is inserted in the system, the initial values
> > of the counters could be different and we do not have a way to know, how
> > many time the registers are wrapped around in the past.
>
> I understand that. If you anticipate that the module may be inserted after a
> wraparound, the driver should populate 'prev_value' with actual counter
> values instead of zeros. That way the driver will properly accumulate
> energy over time it's been inserted. As implemented, the driver counts
> energy since boot time, minus unknown amount lost to wraparounds if the
> driver was loaded too late.

This is not a problem if the module is built into the kernel.
If the module is inserted at a later point, there is no way to recover the
lost count unless the user has tracked the counters since boot and provides
them as an input at module insert (we can implement this).

> In my case I observed the contradictory readings over a period of several
> seconds where no wraparound was possible.
>
> > In our evaluation, the sum of the energy consumption of cores of a socket was
> > always less (actually far lesser) than the socket energy consumption.
>
> Did you try on laptop CPUs (Renoir SoC, Ryzen 4x00U marketing name)? You also
> might need specific workloads to observe the issue, I first found it with a
> small hand-written test, and then found a bigger discrepancy with AVX test
> from https://github.com/travisdowns/avx-turbo

I tried on an octa-core machine, which must be Renoir:
vendor_id  : AuthenticAMD
cpu family : 23
model      : 96

On an idle system over a period of 500 secs:

         | At t=500sec | At t=0      | Diff of energy | in Joules    | Power in Watts
core 0   | 650186538   | 649712585   | 473953         | 0.473953 J   | 0.000947906 W
core 1   | 507792434   | 507131301   | 661133         | 0.661133 J   | 0.001322266 W
core 2   | 455706497   | 455163970   | 542527         | 0.542527 J   | 0.001085054 W
core 3   | 392240356   | 391740417   | 499939         | 0.499939 J   | 0.000999878 W
core 4   | 411461654   | 410687881   | 773773         | 0.773773 J   | 0.001547546 W
core 5   | 288821884   | 288071395   | 750489         | 0.750489 J   | 0.001500978 W
core 6   | 186975219   | 186250793   | 724426         | 0.724426 J   | 0.001448852 W
core 7   | 131509216   | 130458816   | 1050400        | 1.0504 J     | 0.0021008 W
Socket 0 | 31638431930 | 29370733505 | 2267698425     | 2267.698 J   | 4.53539685 W

Power consumed by socket: 4.53539685 W
Sum of power consumed by cores: 0.010953 W

On a system with the AVX test running over a period of 500 secs:

         | At t=500sec | At t=0      | Diff of energy | in Joules    | Power in Watts
core 0   | 649348495   | 413687530   | 235660965      | 235.660965 J | 0.47132193 W
core 1   | 506880081   | 294882827   | 211997254      | 211.997254 J | 0.423994508 W
core 2   | 454804046   | 271046127   | 183757919      | 183.757919 J | 0.367515838 W
core 3   | 391508712   | 237531021   | 153977691      | 153.977691 J | 0.307955382 W
core 4   | 410336868   | 284410079   | 125926789      | 125.926789 J | 0.251853578 W
core 5   | 287569732   | 192306015   | 95263717       | 95.263717 J  | 0.190527434 W
core 6   | 185909622   | 120556152   | 65353470       | 65.35347 J   | 0.13070694 W
core 7   | 129932006   | 95385940    | 34546066       | 34.546066 J  | 0.069092132 W
Socket 0 | 28399099655 | 24799819931 | 3599279724     | 3599.27972 J | 7.198559448 W

Power consumed by socket: 7.198559448 W
Sum of power consumed by cores: 2.212968 W

Can you confirm this?
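
For reference, here is a minimal sketch in Python (not the driver code) of the two calculations discussed in this thread: accumulating a wrapping 32-bit counter with 'prev_value' seeded from the first real reading, and deriving the average power from two accumulated readings. The names EnergyAccumulator and average_power_watts are made up for illustration, and the sketch assumes the readings are in microjoules, which is what the Diff and Joules columns in the tables imply.

```python
MASK32 = 0xFFFFFFFF  # the hardware counters are 32 bits wide


class EnergyAccumulator:
    """Accumulate a wrapping 32-bit counter into an unbounded total."""

    def __init__(self, first_raw):
        # Seed prev with the value read at load time (not zero), so energy
        # is counted from module insertion rather than from boot.
        self.prev = first_raw & MASK32
        self.total = 0

    def update(self, raw):
        # Modulo-2^32 subtraction yields the correct delta as long as the
        # counter wrapped at most once between two polls.
        delta = (raw - self.prev) & MASK32
        self.prev = raw & MASK32
        self.total += delta
        return self.total


def average_power_watts(start_uj, end_uj, seconds):
    # Assumption: readings are in microjoules, so divide by 1e6 for joules.
    return (end_uj - start_uj) / 1e6 / seconds


if __name__ == "__main__":
    acc = EnergyAccumulator(0xFFFFFF00)
    acc.update(0x00000010)  # the counter wrapped once between the two polls
    print(acc.total)        # 272 (0x100 up to the wrap, plus 0x10 after it)

    # Socket 0 row of the idle table, 500 second interval
    print(average_power_watts(29370733505, 31638431930, 500))
```

Seeding from the first reading means only energy consumed while the module is loaded is reported, so any wraparounds before insertion no longer matter.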

>
> Alexander

--
Shine bright,
(: Nav :)