Hi Thomas,

(resent, because I first sent it to tglx only by accident)

first: thanks for the answer.

> That's a really good question. So one hint is that it's only a few boards
> and not a really widespread problem. The other is that it goes away when
> the kernel runs with only one CPU.

Hmm, I noticed that "nr_cpus=1" on the kernel command line greatly
reduced the frequency. But it was not a complete cure.

> It looks like an issue when tearing down the serial interrupt, but I need
> to think more about how to debug that.

Can we have a problem in the ACPI tables?

One thing that I saw in the kernel logs is that Linux is using the MADT
normally, but with "noapic=1" it uses Intel MP Spec 1.4 ?!?!

If the ACPI BIOS describes the interrupts in a wrong way, then maybe we
have an issue here (and not in the Haswell architecture, as I originally
thought).

For the following output ...

- I used kernel v4.14-rc2-64-g464d12309e1b (so the patch creating the
  trouble is in)
- I booted with "apic=debug"
- I did not use "nr_cpus=1" or "noapic=1"
- I ran "stress-ng --fstat 0" for some seconds and then aborted it
- I opened ttyS1 and kept it open to keep its IRQ in use
  ("cat </dev/ttyS1")

root@test:~# dmesg | egrep -i 'irq|apic'
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz root=/dev/sda1 rootwait ro quiet norealroot apic=debug
[ 0.000000] ACPI: APIC 0x00000000DB8CC548 000062 (v03 DLoG Terminal 01072009 AMI 00010013)
[ 0.000000] ACPI: Local APIC address 0xfee00000
[ 0.000000] mapped APIC to ffffffffff5fd000 ( fee00000)
[ 0.000000] ACPI: Local APIC address 0xfee00000
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])
[ 0.000000] IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-39
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 0.000000] Int: type 0, pol 0, trig 0, bus 00, IRQ 00, APIC ID 8, APIC INT 02
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[ 0.000000] Int: type 0, pol 1, trig 3, bus 00, IRQ 09, APIC ID 8, APIC INT 09
[ 0.000000] ACPI: IRQ0 used by override.
[ 0.000000] Int: type 0, pol 0, trig 0, bus 00, IRQ 01, APIC ID 8, APIC INT 01
[ 0.000000] Int: type 0, pol 0, trig 0, bus 00, IRQ 03, APIC ID 8, APIC INT 03
[ 0.000000] Int: type 0, pol 0, trig 0, bus 00, IRQ 04, APIC ID 8, APIC INT 04
[ 0.000000] Int: type 0, pol 0, trig 0, bus 00, IRQ 05, APIC ID 8, APIC INT 05
[ 0.000000] Int: type 0, pol 0, trig 0, bus 00, IRQ 06, APIC ID 8, APIC INT 06
[ 0.000000] Int: type 0, pol 0, trig 0, bus 00, IRQ 07, APIC ID 8, APIC INT 07
[ 0.000000] Int: type 0, pol 0, trig 0, bus 00, IRQ 08, APIC ID 8, APIC INT 08
[ 0.000000] ACPI: IRQ9 used by override.
[ 0.000000] Int: type 0, pol 0, trig 0, bus 00, IRQ 0a, APIC ID 8, APIC INT 0a
[ 0.000000] Int: type 0, pol 0, trig 0, bus 00, IRQ 0b, APIC ID 8, APIC INT 0b
[ 0.000000] Int: type 0, pol 0, trig 0, bus 00, IRQ 0c, APIC ID 8, APIC INT 0c
[ 0.000000] Int: type 0, pol 0, trig 0, bus 00, IRQ 0d, APIC ID 8, APIC INT 0d
[ 0.000000] Int: type 0, pol 0, trig 0, bus 00, IRQ 0e, APIC ID 8, APIC INT 0e
[ 0.000000] Int: type 0, pol 0, trig 0, bus 00, IRQ 0f, APIC ID 8, APIC INT 0f
[ 0.000000] mapped IOAPIC to ffffffffff5fc000 (fec00000)
[ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz root=/dev/sda1 rootwait ro quiet norealroot apic=debug
[ 0.000000] NR_IRQS: 4352, nr_irqs: 512, preallocated irqs: 16
[ 0.000000] APIC: Switch to symmetric I/O mode setup
[ 0.000000] ENABLING IO-APIC IRQs
[ 0.000000] init IO_APIC IRQs
[ 0.000000] apic 8 pin 0 not connected
[ 0.000000] IOAPIC[0]: Set routing entry (8-1 -> 0xef -> IRQ 1 Mode:0 Active:0 Dest:1)
[ 0.000000] IOAPIC[0]: Set routing entry (8-2 -> 0x30 -> IRQ 0 Mode:0 Active:0 Dest:1)
[ 0.000000] IOAPIC[0]: Set routing entry (8-3 -> 0xef -> IRQ 3 Mode:0 Active:0 Dest:1)
[ 0.000000] IOAPIC[0]: Set routing entry (8-4 -> 0xef -> IRQ 4 Mode:0 Active:0 Dest:1)
[ 0.000000] IOAPIC[0]: Set routing entry (8-5 -> 0xef -> IRQ 5 Mode:0 Active:0 Dest:1)
[ 0.000000] IOAPIC[0]: Set routing entry (8-6 -> 0xef -> IRQ 6 Mode:0 Active:0 Dest:1)
[ 0.000000] IOAPIC[0]: Set routing entry (8-7 -> 0xef -> IRQ 7 Mode:0 Active:0 Dest:1)
[ 0.000000] IOAPIC[0]: Set routing entry (8-8 -> 0xef -> IRQ 8 Mode:0 Active:0 Dest:1)
[ 0.000000] IOAPIC[0]: Set routing entry (8-9 -> 0xef -> IRQ 9 Mode:1 Active:0 Dest:1)
[ 0.000000] IOAPIC[0]: Set routing entry (8-10 -> 0xef -> IRQ 10 Mode:0 Active:0 Dest:1)
[ 0.000000] IOAPIC[0]: Set routing entry (8-11 -> 0xef -> IRQ 11 Mode:0 Active:0 Dest:1)
[ 0.000000] IOAPIC[0]: Set routing entry (8-12 -> 0xef -> IRQ 12 Mode:0 Active:0 Dest:1)
[ 0.000000] IOAPIC[0]: Set routing entry (8-13 -> 0xef -> IRQ 13 Mode:0 Active:0 Dest:1)
[ 0.000000] IOAPIC[0]: Set routing entry (8-14 -> 0xef -> IRQ 14 Mode:0 Active:0 Dest:1)
[ 0.000000] IOAPIC[0]: Set routing entry (8-15 -> 0xef -> IRQ 15 Mode:0 Active:0 Dest:1)
[ 0.000000] apic 8 pin 16 not connected
[ 0.000000] apic 8 pin 17 not connected
[ 0.000000] apic 8 pin 18 not connected
[ 0.000000] apic 8 pin 19 not connected
[ 0.000000] apic 8 pin 20 not connected
[ 0.000000] apic 8 pin 21 not connected
[ 0.000000] apic 8 pin 22 not connected
[ 0.000000] apic 8 pin 23 not connected
[ 0.000000] apic 8 pin 24 not connected
[ 0.000000] apic 8 pin 25 not connected
[ 0.000000] apic 8 pin 26 not connected
[ 0.000000] apic 8 pin 27 not connected
[ 0.000000] apic 8 pin 28 not connected
[ 0.000000] apic 8 pin 29 not connected
[ 0.000000] apic 8 pin 30 not connected
[ 0.000000] apic 8 pin 31 not connected
[ 0.000000] apic 8 pin 32 not connected
[ 0.000000] apic 8 pin 33 not connected
[ 0.000000] apic 8 pin 34 not connected
[ 0.000000] apic 8 pin 35 not connected
[ 0.000000] apic 8 pin 36 not connected
[ 0.000000] apic 8 pin 37 not connected
[ 0.000000] apic 8 pin 38 not connected
[ 0.000000] apic 8 pin 39 not connected
[ 0.000000] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=0 pin2=0
[ 0.622083] ACPI: Using IOAPIC for interrupt routing
[ 0.648320] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 10 *11 12 14 15)
[ 0.648388] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 10 11 12 14 15) *0, disabled.
[ 0.648453] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 *10 11 12 14 15)
[ 0.648518] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 *10 11 12 14 15)
[ 0.648581] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 *5 6 10 11 12 14 15)
[ 0.648645] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 10 11 12 14 15) *0, disabled.
[ 0.648709] ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 *6 10 11 12 14 15)
[ 0.648772] ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 10 *11 12 14 15)
[ 0.649387] PCI: Using ACPI for IRQ routing
[ 0.658288] IOAPIC[0]: Set routing entry (8-16 -> 0xef -> IRQ 16 Mode:1 Active:1 Dest:1)
[ 0.658507] IOAPIC[0]: Set routing entry (8-23 -> 0xef -> IRQ 23 Mode:1 Active:1 Dest:1)
[ 0.682305] IOAPIC[0]: Set routing entry (8-16 -> 0xef -> IRQ 16 Mode:1 Active:1 Dest:1)
[ 0.682482] pcieport 0000:00:1c.0: Signaling PME with IRQ 40
[ 0.682662] intel_idle: lapic_timer_reliable_states 0xffffffff
[ 0.684128] IOAPIC[0]: Set routing entry (8-19 -> 0xef -> IRQ 19 Mode:1 Active:1 Dest:1)
[ 0.695347] ata1: SATA max UDMA/133 abar m2048@0xf7e36000 port 0xf7e36100 irq 41
[ 0.695532] IOAPIC[0]: Set routing entry (8-23 -> 0xef -> IRQ 23 Mode:1 Active:1 Dest:1)
[ 0.699492] ehci-pci 0000:00:1d.0: irq 23, io mem 0xf7e37000
[ 0.719124] serio: i8042 KBD port at 0x60,0x64 irq 1
[ 0.719130] serio: i8042 AUX port at 0x60,0x64 irq 12
[ 0.719868] rtc_cmos 00:02: alarms up to one month, y3k, 242 bytes nvram, hpet irqs
[ 0.720040] IOAPIC[0]: Set routing entry (8-18 -> 0xef -> IRQ 18 Mode:1 Active:1 Dest:1)
[ 0.723397] ... APIC ID: 00000000 (0)
[ 0.723398] ... APIC VERSION: 01060015
[ 0.723431] number of MP IRQ sources: 15.
[ 0.723433] number of IO-APIC #8 registers: 40.
[ 0.723434] testing the IO APIC.......................
[ 0.723443] IO APIC #8......
[ 0.723446] ....... : physical APIC id: 08
[ 0.723454] ....... : IO APIC version: 20
[ 0.723458] .... IRQ redirection table:
[ 0.723459] IOAPIC 0:
[ 0.723713] IRQ to pin mappings:
[ 0.723715] IRQ0 -> 0:2
[ 0.723719] IRQ1 -> 0:1
[ 0.723722] IRQ3 -> 0:3
[ 0.723725] IRQ4 -> 0:4
[ 0.723728] IRQ5 -> 0:5
[ 0.723731] IRQ6 -> 0:6
[ 0.723734] IRQ7 -> 0:7
[ 0.723737] IRQ8 -> 0:8
[ 0.723740] IRQ9 -> 0:9
[ 0.723743] IRQ10 -> 0:10
[ 0.723746] IRQ11 -> 0:11
[ 0.723749] IRQ12 -> 0:12
[ 0.723752] IRQ13 -> 0:13
[ 0.723755] IRQ14 -> 0:14
[ 0.723758] IRQ15 -> 0:15
[ 0.723761] IRQ16 -> 0:16
[ 0.723764] IRQ18 -> 0:18
[ 0.723767] IRQ19 -> 0:19
[ 0.723771] IRQ23 -> 0:23
[ 1.602762] parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE]
[ 1.606526] Serial: 8250/16550 driver, 2 ports, IRQ sharing enabled
[ 1.721691] 00:06: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[ 1.724602] IOAPIC[0]: Set routing entry (8-22 -> 0xef -> IRQ 22 Mode:1 Active:1 Dest:1)
[ 1.727079] IOAPIC[0]: Set routing entry (8-20 -> 0xef -> IRQ 20 Mode:1 Active:1 Dest:1)
[ 1.845056] 00:07: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 115200) is a 16550A
[ 131.849916] do_IRQ: 0.46 No irq handler for vector
[ 132.210010] do_IRQ: 0.46 No irq handler for vector
[ 132.330033] do_IRQ: 0.46 No irq handler for vector
[ 132.570025] do_IRQ: 0.46 No irq handler for vector

root@test:~# cd /sys/kernel/debug/irq/domains/
root@test:/sys/kernel/debug/irq/domains# grep . *
default:name: VECTOR
default: size: 0
default: mapped: 27
default: flags: 0x00000041
default:Online bitmaps: 2
default:Global available: 391
default:Global reserved: 12
default:Total allocated: 15
default:System: 41: 0-19,32,50,128,238-255
default: | CPU | avl | man | act | vectors
default:     0   188     0    15  33-46,48
default:     1   203     0     0
IO-APIC-0:name: IO-APIC-0
IO-APIC-0: size: 40
IO-APIC-0: mapped: 21
IO-APIC-0: flags: 0x00000041
IO-APIC-0: parent: VECTOR
IO-APIC-0: name: VECTOR
IO-APIC-0: size: 0
IO-APIC-0: mapped: 27
IO-APIC-0: flags: 0x00000041
IO-APIC-0:Online bitmaps: 2
IO-APIC-0:Global available: 391
IO-APIC-0:Global reserved: 12
IO-APIC-0:Total allocated: 15
IO-APIC-0:System: 41: 0-19,32,50,128,238-255
IO-APIC-0: | CPU | avl | man | act | vectors
IO-APIC-0:     0   188     0    15  33-46,48
IO-APIC-0:     1   203     0     0
PCI-HT:name: PCI-HT
PCI-HT: size: 0
PCI-HT: mapped: 0
PCI-HT: flags: 0x00000041
PCI-HT: parent: VECTOR
PCI-HT: name: VECTOR
PCI-HT: size: 0
PCI-HT: mapped: 27
PCI-HT: flags: 0x00000041
PCI-HT:Online bitmaps: 2
PCI-HT:Global available: 391
PCI-HT:Global reserved: 12
PCI-HT:Total allocated: 15
PCI-HT:System: 41: 0-19,32,50,128,238-255
PCI-HT: | CPU | avl | man | act | vectors
PCI-HT:     0   188     0    15  33-46,48
PCI-HT:     1   203     0     0
PCI-MSI-2:name: PCI-MSI-2
PCI-MSI-2: size: 0
PCI-MSI-2: mapped: 6
PCI-MSI-2: flags: 0x00000051
PCI-MSI-2: parent: VECTOR
PCI-MSI-2: name: VECTOR
PCI-MSI-2: size: 0
PCI-MSI-2: mapped: 27
PCI-MSI-2: flags: 0x00000041
PCI-MSI-2:Online bitmaps: 2
PCI-MSI-2:Global available: 391
PCI-MSI-2:Global reserved: 12
PCI-MSI-2:Total allocated: 15
PCI-MSI-2:System: 41: 0-19,32,50,128,238-255
PCI-MSI-2: | CPU | avl | man | act | vectors
PCI-MSI-2:     0   188     0    15  33-46,48
PCI-MSI-2:     1   203     0     0
\_SB_.PCI0.SBUS:name: \_SB_.PCI0.SBUS
\_SB_.PCI0.SBUS: size: 120
\_SB_.PCI0.SBUS: mapped: 0
\_SB_.PCI0.SBUS: flags: 0x00000040
VECTOR:name: VECTOR
VECTOR: size: 0
VECTOR: mapped: 27
VECTOR: flags: 0x00000041
VECTOR:Online bitmaps: 2
VECTOR:Global available: 391
VECTOR:Global reserved: 12
VECTOR:Total allocated: 15
VECTOR:System: 41: 0-19,32,50,128,238-255
VECTOR: | CPU | avl | man | act | vectors
VECTOR:     0   188     0    15  33-46,48
VECTOR:     1   203     0     0

root@test:/sys/kernel/debug/irq/domains# cd ../irqs/
root@test:/sys/kernel/debug/irq/irqs# grep 46 *
3: Vector: 46
root@test:/sys/kernel/debug/irq/irqs# cat 3
handler: handle_edge_irq
device: (null)
status: 0x00000000
istate: 0x00000000
ddepth: 0
wdepth: 0
dstate: 0x01400200
  IRQD_ACTIVATED
  IRQD_IRQ_STARTED
  IRQD_SINGLE_TARGET
node: 0
affinity: 0-1
effectiv: 0
pending:
domain: IO-APIC-0
 hwirq: 0x3
 chip: IO-APIC
  flags: 0x10
    IRQCHIP_SKIP_SET_WAKE
 parent:
    domain: VECTOR
     hwirq: 0x3
     chip: APIC
      flags: 0x0
     Vector: 46
     Target: 0

root@test:/sys/kernel/debug/irq/irqs# cat /proc/interrupts
        CPU0   CPU1
  0:      11      0   IO-APIC    2-edge      timer
  1:       2      0   IO-APIC    1-edge      i8042
  3:      46      0   IO-APIC    3-edge      ttyS1
  4:      37      0   IO-APIC    4-edge
  7:       0      0   IO-APIC    7-edge      parport0
  8:       1      0   IO-APIC    8-edge      rtc0
  9:       4      0   IO-APIC    9-fasteoi   acpi
 12:       4      0   IO-APIC   12-edge      i8042
 18:       0      0   IO-APIC   18-fasteoi   i801_smbus
 23:      33      0   IO-APIC   23-fasteoi   ehci_hcd:usb1
 40:       0      0   PCI-MSI 458752-edge    PCIe PME, pciehp
 41:    3452      0   PCI-MSI 512000-edge    ahci[0000:00:1f.2]
 42:     103      0   PCI-MSI 327680-edge    xhci_hcd
 43:     324      0   PCI-MSI 442368-edge    snd_hda_intel:card0
 44:    1388      0   PCI-MSI 409600-edge    eth0
 45:      97      0   PCI-MSI 32768-edge     i915
NMI:       0      0   Non-maskable interrupts
LOC:    2194   3127   Local timer interrupts
SPU:       0      0   Spurious interrupts
PMI:       0      0   Performance monitoring interrupts
IWI:       0      0   IRQ work interrupts
RTR:       1      0   APIC ICR read retries
RES:     454    464   Rescheduling interrupts
CAL:      41   1376   Function call interrupts
TLB:       0      0   TLB shootdowns
TRM:       0      0   Thermal event interrupts
THR:       0      0   Threshold APIC interrupts
MCE:       0      0   Machine check exceptions
MCP:       1      2   Machine check polls
ERR:       0
MIS:       0
PIN:       0      0   Posted-interrupt notification event
NPI:       0      0   Nested posted-interrupt event
PIW:       0      0   Posted-interrupt wakeup event

Regarding the Haswell: in an Intel document I found the heading "Mobile
4th Generation Intel Core Processor Family I/O, External design
specification EDS" and below it a table. This table said the following
properties are true for IRQ 3+4:

- SERIRQ: yes
- Direct from Pin: no
- Using PCI Message: yes

I can understand the first two: the IRQs originate from a Winbond SuperIO
that is attached via LPC. But the last item confuses me: if the IRQ is
delivered via MSI, shouldn't I then see PCI-MSI in /proc/interrupts???
Or should the vector then be in a different domain, i.e. not in
IO-APIC-0 but in PCI-* ????

Greetings,
Holger
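
P.S. The manual lookup above (take the CPU and vector from the
"do_IRQ: 0.46 No irq handler for vector" message, then grep the debugfs
irq files for that vector) could be automated roughly as below. This is
only a sketch: the function names are made up for illustration, it
assumes the message has the form "<cpu>.<vector>" and that each file
under /sys/kernel/debug/irq/irqs/ contains "Vector:" and "Target:"
lines as in the "cat 3" output above, and it needs root plus a kernel
with the irq debugfs files enabled.

```python
import re
from pathlib import Path

# Assumed message format, as seen in the dmesg output above:
#   do_IRQ: <cpu>.<vector> No irq handler for vector
SPURIOUS_RE = re.compile(r"do_IRQ: (\d+)\.(\d+) No irq handler for vector")
VECTOR_RE = re.compile(r"Vector:\s*(\d+)")
TARGET_RE = re.compile(r"Target:\s*(\d+)")


def parse_spurious(dmesg_text):
    """Return the sorted, unique (cpu, vector) pairs reported in dmesg text."""
    return sorted({(int(c), int(v)) for c, v in SPURIOUS_RE.findall(dmesg_text)})


def find_owner(irq_dumps, cpu, vector):
    """Return the IRQ numbers whose debugfs dump names this vector and CPU.

    irq_dumps maps an IRQ number to the text of its debugfs file.
    """
    owners = []
    for irq, text in irq_dumps.items():
        v = VECTOR_RE.search(text)
        t = TARGET_RE.search(text)
        if v and t and int(v.group(1)) == vector and int(t.group(1)) == cpu:
            owners.append(irq)
    return sorted(owners)


def load_irq_dumps(root="/sys/kernel/debug/irq/irqs"):
    """Read every numerically named file below root into a dict."""
    dumps = {}
    for entry in Path(root).iterdir():
        if entry.name.isdigit():
            dumps[int(entry.name)] = entry.read_text()
    return dumps
```

Feeding it the dmesg output and the irq files from this mail should
report IRQ 3 (ttyS1) as the owner of vector 46 on CPU 0, i.e. the same
result as the "grep 46 *" above.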