Re: "No irq handler for vector" problem

> Though SERIRQ is a really good hint. Let me look into the details of that a
> bit more.

Maybe I have more hints for you:

- nolapic=1 also helps
- preventing i915.ko from being loaded also helps

I checked this on 4.18.17.


How did we arrive at i915?  After running stress-ng for a few seconds
(or any program that opens/closes /dev/ttyS*) I got this:

root@test:~# dmesg | grep do_IRQ
[   15.627047] do_IRQ: 1.39 No irq handler for vector
[   15.747112] do_IRQ: 1.39 No irq handler for vector
[   15.747228] do_IRQ: 1.39 No irq handler for vector
[   15.867514] do_IRQ: 1.39 No irq handler for vector
[   15.926858] do_IRQ: 1.39 No irq handler for vector
[   15.926987] do_IRQ: 1.39 No irq handler for vector
[   16.047095] do_IRQ: 1.39 No irq handler for vector
[   16.047209] do_IRQ: 1.39 No irq handler for vector
[   16.167521] do_IRQ: 1.39 No irq handler for vector
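
To summarize output like the above, here is a minimal sketch that counts
the messages per (CPU, vector) pair. It assumes the two numbers in
"1.39" are the CPU number and the vector number, which is how I read the
x86 do_IRQ message on kernels of this era; the function name and the
sample text are only illustrative.

```python
import re
from collections import Counter

# Assumption: the message format is "do_IRQ: <cpu>.<vector> No irq handler ...".
DO_IRQ_RE = re.compile(r"do_IRQ: (\d+)\.(\d+) No irq handler for vector")


def count_unhandled(dmesg_text):
    """Return a Counter keyed by (cpu, vector) for every matching line."""
    hits = Counter()
    for line in dmesg_text.splitlines():
        match = DO_IRQ_RE.search(line)
        if match:
            hits[(int(match.group(1)), int(match.group(2)))] += 1
    return hits


# Three lines copied from the dmesg output in this mail.
SAMPLE = """\
[   15.627047] do_IRQ: 1.39 No irq handler for vector
[   15.747112] do_IRQ: 1.39 No irq handler for vector
[   15.747228] do_IRQ: 1.39 No irq handler for vector
"""

if __name__ == "__main__":
    for (cpu, vector), count in sorted(count_unhandled(SAMPLE).items()):
        print(f"cpu={cpu} vector={vector} count={count}")
```

Feeding it the full `dmesg` text instead of SAMPLE would show whether
only one CPU/vector pair is affected.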

Then I looked at this vector. I made sure the serial ports are both open
with a cat+redirection.

root@test:/sys/kernel/debug/irq/irqs# grep 39 *
3:     Vector:    39

I looked at this file and with my limited knowledge I couldn't find
anything wrong there:

root@test:/sys/kernel/debug/irq/irqs# cat 3
handler:  handle_edge_irq
device:   (null)
status:   0x00000000
istate:   0x00000000
ddepth:   0
wdepth:   0
dstate:   0x05400200
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET
            IRQD_CAN_RESERVE
node:     0
affinity: 0-1
effectiv: 1
pending:  
domain:  IO-APIC-0
 hwirq:   0x3
 chip:    IO-APIC
  flags:   0x10
             IRQCHIP_SKIP_SET_WAKE
 parent:
    domain:  VECTOR
     hwirq:   0x3
     chip:    APIC
      flags:   0x0
     Vector:    39
     Target:     1
     move_in_progress: 0
     is_managed:       0
     can_reserve:      1
     has_reserved:     0
     cleanup_pending:  0
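
The dstate word can be cross-checked against the flag names printed
below it. The following is a sketch only: the bit positions are my
inference from the dumps in this mail (and from memory of
include/linux/irq.h), so treat them as an assumption, and the table only
covers the four flags that appear here.

```python
# Assumed bit positions; only the flags seen in this mail are listed.
DSTATE_FLAGS = {
    1 << 9: "IRQD_ACTIVATED",
    1 << 22: "IRQD_IRQ_STARTED",
    1 << 24: "IRQD_SINGLE_TARGET",
    1 << 26: "IRQD_CAN_RESERVE",
}


def decode_dstate(value):
    """Return (list of known flag names, leftover bits not in the table)."""
    names = [name for bit, name in sorted(DSTATE_FLAGS.items()) if value & bit]
    unknown = value & ~sum(DSTATE_FLAGS)
    return names, unknown


if __name__ == "__main__":
    # 0x05400200 is the dstate of IRQ 3 shown above.
    names, unknown = decode_dstate(0x05400200)
    print(names, hex(unknown))
```

A non-zero second value would mean the dump contains a flag this table
does not know about.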

Then I thought "Maybe there's an off-by-one error somewhere". I checked
the other vectors. Nothing used vector 40. But maybe vector 38?

root@test:/sys/kernel/debug/irq/irqs# grep 38 *
44:     Vector:    38
45:     Vector:    38

The output of those two files is:

root@test:/sys/kernel/debug/irq/irqs# cat 44 
handler:  handle_edge_irq
device:   0000:00:1b.0
status:   0x00000000
istate:   0x00000000
ddepth:   0
wdepth:   0
dstate:   0x01400200
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET
node:     -1
affinity: 0-1
effectiv: 1
pending:  
domain:  PCI-MSI-2
 hwirq:   0x6c000
 chip:    PCI-MSI
  flags:   0x10
             IRQCHIP_SKIP_SET_WAKE
 parent:
    domain:  VECTOR
     hwirq:   0x2c
     chip:    APIC
      flags:   0x0
     Vector:    38
     Target:     1
     move_in_progress: 0
     is_managed:       0
     can_reserve:      0
     has_reserved:     0
     cleanup_pending:  0
root@test:/sys/kernel/debug/irq/irqs# cat 45 
handler:  handle_edge_irq
device:   0000:00:02.0
status:   0x00000000
istate:   0x00000000
ddepth:   0
wdepth:   0
dstate:   0x01400200
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET
node:     -1
affinity: 0-1
effectiv: 0
pending:  
domain:  PCI-MSI-2
 hwirq:   0x8000
 chip:    PCI-MSI
  flags:   0x10
             IRQCHIP_SKIP_SET_WAKE
 parent:
    domain:  VECTOR
     hwirq:   0x2d
     chip:    APIC
      flags:   0x0
     Vector:    38
     Target:     0
     move_in_progress: 0
     is_managed:       0
     can_reserve:      0
     has_reserved:     0
     cleanup_pending:  0
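
Since a plain grep for "39" or "38" matches any line containing those
digits, a slightly more precise way to build the same overview is to
pull only the Vector: and Target: lines out of each file and group the
IRQs by (target CPU, vector). This is a sketch under the assumption that
the debugfs layout is the one shown in the dumps above; `vector_map` and
`DUMPS` are hypothetical names. As far as I understand, x86 vectors are
allocated per CPU, so the same vector number with different targets is
not by itself a conflict.

```python
import re
from collections import defaultdict

VECTOR_RE = re.compile(r"^\s*Vector:\s*(\d+)\s*$", re.MULTILINE)
TARGET_RE = re.compile(r"^\s*Target:\s*(\d+)\s*$", re.MULTILINE)


def vector_map(dumps):
    """dumps maps an IRQ number to the text of its debugfs file.

    Returns {(target_cpu, vector): [irq, ...]}; files without both
    a Vector: and a Target: line are skipped.
    """
    result = defaultdict(list)
    for irq, text in sorted(dumps.items()):
        vector = VECTOR_RE.search(text)
        target = TARGET_RE.search(text)
        if vector and target:
            key = (int(target.group(1)), int(vector.group(1)))
            result[key].append(irq)
    return dict(result)


# Only the Vector/Target lines of the three dumps in this mail.
DUMPS = {
    3: "     Vector:    39\n     Target:     1\n",
    44: "     Vector:    38\n     Target:     1\n",
    45: "     Vector:    38\n     Target:     0\n",
}

if __name__ == "__main__":
    for (cpu, vector), irqs in sorted(vector_map(DUMPS).items()):
        print(f"cpu={cpu} vector={vector} irqs={irqs}")
```

With the values above, IRQ 3 lands on (CPU 1, vector 39), IRQ 44 on
(CPU 1, vector 38) and IRQ 45 on (CPU 0, vector 38).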



According to lspci ...

root@test:/sys/kernel/debug/irq/irqs# lspci -nn
00:00.0 Host bridge [0600]: Intel Corporation Haswell-ULT DRAM Controller [8086:0a04] (rev 0b)
00:02.0 VGA compatible controller [0300]: Intel Corporation Haswell-ULT Integrated Graphics Controller [8086:0a06] (rev 0b)
00:14.0 USB controller [0c03]: Intel Corporation 8 Series USB xHCI HC [8086:9c31] (rev 04)
00:16.0 Communication controller [0780]: Intel Corporation 8 Series HECI #0 [8086:9c3a] (rev 04)
00:19.0 Ethernet controller [0200]: Intel Corporation Ethernet Connection I218-LM [8086:155a] (rev 04)
00:1b.0 Audio device [0403]: Intel Corporation 8 Series HD Audio Controller [8086:9c20] (rev 04)
00:1c.0 PCI bridge [0604]: Intel Corporation 8 Series PCI Express Root Port 1 [8086:9c10] (rev e4)
00:1d.0 USB controller [0c03]: Intel Corporation 8 Series USB EHCI #1 [8086:9c26] (rev 04)
00:1f.0 ISA bridge [0601]: Intel Corporation 8 Series LPC Controller [8086:9c43] (rev 04)
00:1f.2 SATA controller [0106]: Intel Corporation 8 Series SATA Controller 1 [AHCI mode] [8086:9c03] (rev 04)
00:1f.3 SMBus [0c05]: Intel Corporation 8 Series SMBus Controller [8086:9c22] (rev 04)

IRQ 44 is audio (00:1b.0) and IRQ 45 is i915 (00:02.0); both use
vector 38. I started by removing i915. After a reboot the behavior of
the system changed: the "No irq handler for vector" message was gone.

I added i915 again, rebooted, and the error was back.



root@test:/sys/kernel/debug/irq/irqs# cat /proc/interrupts 
           CPU0       CPU1       
  0:          8          0   IO-APIC   2-edge      timer
  1:          0          5   IO-APIC   1-edge      i8042
  3:          0         39   IO-APIC   3-edge      ttyS1
  4:          0         39   IO-APIC   4-edge    
  8:          1          0   IO-APIC   8-edge      rtc0
  9:          0          4   IO-APIC   9-fasteoi   acpi
 12:          7          0   IO-APIC  12-edge      i8042
 18:          0          0   IO-APIC  18-fasteoi   i801_smbus
 23:         33          0   IO-APIC  23-fasteoi   ehci_hcd:usb1
 40:          0          0   PCI-MSI 458752-edge      PCIe PME, pciehp
 41:          0       3618   PCI-MSI 512000-edge      ahci[0000:00:1f.2]
 42:          0        140   PCI-MSI 327680-edge      xhci_hcd
 43:       1711          0   PCI-MSI 409600-edge      eth0
 44:          0        343   PCI-MSI 442368-edge      snd_hda_intel:card0
 45:        119          0   PCI-MSI 32768-edge      i915
NMI:          0          0   Non-maskable interrupts
LOC:       4467       5445   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          0          0   Performance monitoring interrupts
IWI:          1          1   IRQ work interrupts
RTR:          0          0   APIC ICR read retries
RES:        620        511   Rescheduling interrupts
CAL:       1189        308   Function call interrupts
TLB:         13          8   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:          2          3   Machine check polls
ERR:          0
MIS:          0
PIN:          0          0   Posted-interrupt notification event
NPI:          0          0   Nested posted-interrupt event
PIW:          0          0   Posted-interrupt wakeup event
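
The PCI-MSI numbers in /proc/interrupts (442368 = 0x6c000, 32768 =
0x8000) are the same hwirq values as in the debugfs dumps, and they can
be mapped back to the lspci addresses. The sketch below assumes the
hwirq is packed as entry number in the low 11 bits, then bus/devfn, then
the PCI domain from bit 27 up; that is my understanding of the kernel's
MSI code at this version, not something stated in this thread, and the
function names are made up for illustration.

```python
def decode_msi_hwirq(hwirq):
    """Split a PCI-MSI hwirq into (domain, bus, slot, function, entry).

    Assumed layout: entry | (bus << 8 | devfn) << 11 | domain << 27.
    """
    entry = hwirq & 0x7FF            # low 11 bits: MSI entry number
    devid = (hwirq >> 11) & 0xFFFF   # bus in the high byte, devfn in the low
    domain = hwirq >> 27
    bus = devid >> 8
    devfn = devid & 0xFF
    return domain, bus, devfn >> 3, devfn & 0x7, entry


def pci_address(hwirq):
    """Format the decoded hwirq like the addresses lspci -D prints."""
    domain, bus, slot, func, _entry = decode_msi_hwirq(hwirq)
    return f"{domain:04x}:{bus:02x}:{slot:02x}.{func:x}"


if __name__ == "__main__":
    # Values taken from the /proc/interrupts output above.
    for hwirq in (442368, 32768, 512000):
        print(hwirq, pci_address(hwirq))
```

The 512000 case is a useful sanity check because /proc/interrupts
already prints the address for that line (ahci[0000:00:1f.2]).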





PS: I'm okay with testing this on 4.20-rc1 or any git tree you throw at me...


