On 10/29/22 05:33, Marc SCHAEFER wrote:
Hello,
I am using the apu2 embedded platform, which uses an amd64 AMD GX-412TC SOC,
stepping : 1
microcode : 0x7030105
With Debian bullseye, the power reading at idle is implausibly high (above
80 to 100 W). We have observed this behaviour on multiple systems.
The problem did not occur with Debian buster, does not occur with the
temperature sensor, and the power measurement goes back to apparently correct
values when the system is no longer idle.
It does not seem to be linked to amd64-specific firmware.
The problem lies in the value of /sys/class/hwmon/hwmon0/power1_average, not in
the lm-sensors package (reading the /sys file directly gives the same issue).
So it appears to be within the kernel: 4.19.0-22-amd64 seems ok and
5.10.0-18-amd64 is not.
Oddly, there do not seem to be any relevant changes in the specific kernel
driver (fam15h_power).
Any idea what could lead to this strange behaviour?
A few, but they are all more or less unlikely.
- Debian might carry some non-upstream driver patches causing the problem
(or fixing it in the older kernel, and the patch was not applied to the
new kernel).
- Debian installs its own version of the CPU firmware, and the version
installed with the newer kernel introduces the problem.
Normally the BIOS would update the CPU firmware, but that may not be
the case for older systems.
- The problem is caused by some change in the kernel outside the
fam15h_power driver. I cannot imagine what that might be, but it is
a possibility.
You should be able to check the first two possibilities. For the last one,
the only means I could think of would be to bisect between the good and
the bad version.
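
One way to check the firmware possibility is to compare the microcode revision that each kernel reports in /proc/cpuinfo. The following is only a sketch: the helper name microcode_rev is made up for illustration, and it assumes the usual x86 "microcode : 0x..." line is present.

```shell
# microcode_rev [FILE]: print the first "microcode" value found in FILE.
# FILE defaults to /proc/cpuinfo; the function name is hypothetical.
microcode_rev() {
    awk -F': *' '/^microcode/ { print $2; exit }' "${1:-/proc/cpuinfo}"
}

# Run this once under each kernel and compare the two results.
if [ -r /proc/cpuinfo ]; then
    microcode_rev
fi
```

If both kernels report the same revision (0x7030105 in the original report), a microcode difference is unlikely to be the cause.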
Guenter
Thank you for any ideas or pointers.
Examples:
When bullseye is idle, the value is completely wrong (the ' digit separators are mine):
cat /sys/class/hwmon/hwmon0/power1_average
94'019'396
When bullseye has 100% CPU used (one core):
cat /sys/class/hwmon/hwmon0/power1_average
10'917'309
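
For reference, the hwmon sysfs interface reports power1_average in microwatts, so the two raw readings above can be converted to watts as follows. This is a small sketch; the helper name uw_to_w is made up for illustration.

```shell
# uw_to_w MICROWATTS: print the value in watts with two decimals.
# LC_ALL=C keeps the decimal point independent of the locale.
uw_to_w() {
    LC_ALL=C awk -v uw="$1" 'BEGIN { printf "%.2f\n", uw / 1000000 }'
}

uw_to_w 94019396   # idle reading on bullseye      -> 94.02
uw_to_w 10917309   # one core at 100% on bullseye  -> 10.92
```

So the idle reading corresponds to about 94 W, while the loaded reading of about 11 W is in a plausible range for this board.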
The only visible change is that hwmon1 and hwmon0 are interchanged:
bullseye:
fam15h_power-pci-00c4
Adapter: PCI adapter
power1: 88.61 W (interval = 0.01 s, crit = 6.00 W)
k10temp-pci-00c3
Adapter: PCI adapter
temp1: +54.5 C (high = +70.0 C)
(crit = +105.0 C, hyst = +104.0 C)
buster:
k10temp-pci-00c3
Adapter: PCI adapter
temp1: +59.6°C (high = +70.0°C)
(crit = +105.0°C, hyst = +104.0°C)
fam15h_power-pci-00c4
Adapter: PCI adapter
power1: 8.00 W (interval = 0.01 s, crit = 6.00 W)
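
Since the hwmon index differs between the two releases, a script that hard-codes hwmon0 may read the wrong device. A sketch of resolving the directory by driver name instead, using the "name" file each hwmon device exposes (the helper find_hwmon and its optional base-directory argument are made up for illustration):

```shell
# find_hwmon NAME [BASE]: print the hwmon directory whose "name" file
# contains NAME. BASE defaults to /sys/class/hwmon.
find_hwmon() {
    want=$1
    base=${2:-/sys/class/hwmon}
    for d in "$base"/hwmon*; do
        [ -r "$d/name" ] || continue
        if [ "$(cat "$d/name")" = "$want" ]; then
            printf '%s\n' "$d"
            return 0
        fi
    done
    return 1
}

# Read the power sensor regardless of which index it was assigned.
if dir=$(find_hwmon fam15h_power); then
    cat "$dir/power1_average"
else
    echo "fam15h_power hwmon device not found" >&2
fi
```

This only makes the path stable across kernels; it does not change the value the driver reports.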