On 16. 08. 24 at 20:29, Rafael J. Wysocki wrote:
On Wed, Aug 14, 2024 at 8:48 AM Jiri Slaby <jirislaby@xxxxxxxxxx> wrote:
On 14. 08. 24, 7:22, Jiri Slaby wrote:
Hi,
an openSUSE user reported that with 6.10, he sees one CPU under an
IRQ storm from ACPI (sci_interrupt):
9: 20220768 ... IR-IO-APIC 9-fasteoi acpi
At:
https://bugzilla.suse.com/show_bug.cgi?id=1229085
6.9 was OK.
With acpi.debug_level=0x08000000 acpi.debug_layer=0xffffffff, the log is
flooded with repeats of:
evgpe-0673 ev_detect_gpe : Read registers for GPE 6D:
Status=20, Enable=00, RunEnable=4A, WakeEnable=00
0x6d seems to count excessively (10 snapshots every 1 second):
/sys/firmware/acpi/interrupts/gpe6D: 82066 EN STS enabled unmasked
/sys/firmware/acpi/interrupts/gpe6D: 86536 EN STS enabled unmasked
/sys/firmware/acpi/interrupts/gpe6D: 90990 STS enabled unmasked
/sys/firmware/acpi/interrupts/gpe6D: 95468 EN STS enabled unmasked
/sys/firmware/acpi/interrupts/gpe6D: 100282 EN STS enabled unmasked
/sys/firmware/acpi/interrupts/gpe6D: 105187 STS enabled unmasked
/sys/firmware/acpi/interrupts/gpe6D: 110014 STS enabled unmasked
/sys/firmware/acpi/interrupts/gpe6D: 114852 STS enabled unmasked
/sys/firmware/acpi/interrupts/gpe6D: 119682 STS enabled unmasked
/sys/firmware/acpi/interrupts/gpe6D: 124194 STS enabled unmasked
/sys/firmware/acpi/interrupts/gpe6D: 128641 EN STS enabled unmasked
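To put a rough number on the growth, here is a minimal sketch. It assumes the snapshots above were taken 1 second apart, so the first and last lines span 10 seconds; that reading of "10 snapshots every 1 second" is an assumption. The helper names gpe_count and gpe_rate are made up for illustration.

```shell
#!/bin/sh
# gpe_count FILE
# Print the event count, i.e. the first field of a file such as
# /sys/firmware/acpi/interrupts/gpe6D ("   82066  EN STS enabled  unmasked").
gpe_count() {
    awk '{ print $1; exit }' "$1"
}

# gpe_rate FIRST LAST SECONDS
# Print the integer number of events per second between two counts.
gpe_rate() {
    echo $(( ($2 - $1) / $3 ))
}

# First and last counts from the listing above, assumed 10 seconds apart.
gpe_rate 82066 128641 10

# Live measurement on the affected machine would look like this:
#   a=$(gpe_count /sys/firmware/acpi/interrupts/gpe6D); sleep 1
#   b=$(gpe_count /sys/firmware/acpi/interrupts/gpe6D); gpe_rate "$a" "$b" 1
```

Under that assumption the counter grows by several thousand events per second, which is consistent with a storm rather than occasional wakeup events.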
acpidump:
https://bugzilla.suse.com/attachment.cgi?id=876677
DSDT:
https://bugzilla.suse.com/attachment.cgi?id=876678
Any ideas?
GPE 6D is listed in _PRW for some devices, so maybe one of them
continues to trigger wakeup events?
Disabling the powertop service (which calls /usr/sbin/powertop --auto-tune)
solves the problem completely. After some searching, I found that this is
the cause:
# causes IRQ storm on 6.10.x
# kernel 6.9.9 is immune
echo 'auto' > /sys/bus/pci/devices/0000:00:1f.6/power/control
lspci | grep 1f.6
00:1f.6 Ethernet controller: Intel Corporation Device 550b (rev 20)
journalctl -b | grep 1f.6
srp 17 19:44:17 e14 kernel: pci 0000:00:1f.6: [8086:550b] type 00 class 0x020000 conventional PCI endpoint
srp 17 19:44:17 e14 kernel: pci 0000:00:1f.6: BAR 0 [mem 0x9c300000-0x9c31ffff]
srp 17 19:44:17 e14 kernel: pci 0000:00:1f.6: PME# supported from D0 D3hot D3cold
srp 17 19:44:17 e14 kernel: pci 0000:00:1f.6: Adding to iommu group 12
srp 17 19:44:19 e14 kernel: e1000e 0000:00:1f.6: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
srp 17 19:44:19 e14 kernel: e1000e 0000:00:1f.6 0000:00:1f.6 (uninitialized): registered PHC clock
srp 17 19:44:20 e14 kernel: e1000e 0000:00:1f.6 eth0: (PCI Express:2.5GT/s:Width x1) fc:5c:ee:b0:13:74
srp 17 19:44:20 e14 kernel: e1000e 0000:00:1f.6 eth0: Intel(R) PRO/1000 Network Connection
srp 17 19:44:20 e14 kernel: e1000e 0000:00:1f.6 eth0: MAC: 16, PHY: 12, PBA No: FFFFFF-0FF
srp 17 19:44:20 e14 kernel: e1000e 0000:00:1f.6 enp0s31f6: renamed from eth0
srp 17 19:44:24 e14 ModemManager[1434]: <info> [base-manager] couldn't check support for device '/sys/devices/pci0000:00/0000:00:1f.6': not supported by any plugin
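Since the storm starts when power/control is set to 'auto', writing 'on' back should keep the device out of runtime suspend until the real cause is found. A minimal sketch follows; the function name set_runtime_pm is hypothetical, and whether this alone stops a storm that is already running has not been verified here.

```shell
#!/bin/sh
# set_runtime_pm CONTROL_FILE MODE
# Write "on" (keep the device active) or "auto" (allow runtime suspend)
# to a device's power/control file. Any other mode is rejected.
set_runtime_pm() {
    case "$2" in
        on|auto)
            printf '%s\n' "$2" > "$1"
            ;;
        *)
            echo "mode must be 'on' or 'auto'" >&2
            return 1
            ;;
    esac
}

# On the affected machine (as root) this would be:
#   set_runtime_pm /sys/bus/pci/devices/0000:00:1f.6/power/control on
```

This has to run after powertop --auto-tune, otherwise powertop will switch the device back to 'auto'.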
You can ask the reporter to mask that GPE via "echo mask >
/sys/firmware/acpi/interrupts/gpe6D" and see if the storm goes away
then.
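To confirm that the mask actually took effect, the flags printed by the same sysfs file can be checked. A small sketch, assuming the last flag changes from "unmasked" to "masked" after the write; the helper name gpe_is_masked is made up for illustration.

```shell
#!/bin/sh
# gpe_is_masked FILE
# Succeed if one of the flags after the count is exactly "masked".
# "unmasked" does not match because whole fields are compared.
gpe_is_masked() {
    awk '{ for (i = 2; i <= NF; i++) if ($i == "masked") found = 1 }
         END { exit !found }' "$1"
}

# On the affected machine (as root) this would be:
#   echo mask > /sys/firmware/acpi/interrupts/gpe6D
#   gpe_is_masked /sys/firmware/acpi/interrupts/gpe6D && echo "gpe6D is masked"
# and to undo it:
#   echo unmask > /sys/firmware/acpi/interrupts/gpe6D
```

If the mask needs to survive a reboot, the acpi_mask_gpe= kernel parameter may be an option, though that has not been tried in this thread.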
The only ACPI core issue introduced between 6.9 and 6.10 I'm aware of
is the one addressed by this series
https://lore.kernel.org/linux-acpi/22385894.EfDdHjke4D@xxxxxxxxxxxxx/
but this is about the EC and the problem here doesn't appear to be
EC-related. It may be worth trying anyway, though.