Re: [PATCH v4.14, v4.19, v5.4, v5.10, v5.15] igb: free up irq resources in device shutdown path.

On Wed, Mar 13, 2024 at 02:07:13AM +1100, Imran Khan wrote:
> [ Upstream commit 9fb9eb4b59acc607e978288c96ac7efa917153d4 ]

No it is not.

> 
> Systems using the igb driver crash while executing the poweroff command,
> with the following call stack:
> 
> crash> bt -a
> PID: 62583    TASK: ffff97ebbf28dc40  CPU: 0    COMMAND: "poweroff"
>  #0 [ffffa7adcd64f8a0] machine_kexec at ffffffffa606c7c1
>  #1 [ffffa7adcd64f900] __crash_kexec at ffffffffa613bb52
>  #2 [ffffa7adcd64f9d0] panic at ffffffffa6099c45
>  #3 [ffffa7adcd64fa50] oops_end at ffffffffa603359a
>  #4 [ffffa7adcd64fa78] die at ffffffffa6033c32
>  #5 [ffffa7adcd64faa8] do_trap at ffffffffa60309a0
>  #6 [ffffa7adcd64faf8] do_error_trap at ffffffffa60311e7
>  #7 [ffffa7adcd64fbc0] do_invalid_op at ffffffffa6031320
>  #8 [ffffa7adcd64fbd0] invalid_op at ffffffffa6a01f2a
>     [exception RIP: free_msi_irqs+408]
>     RIP: ffffffffa645d248  RSP: ffffa7adcd64fc88  RFLAGS: 00010286
>     RAX: ffff97eb1396fe00  RBX: 0000000000000000  RCX: ffff97eb1396fe00
>     RDX: ffff97eb1396fe00  RSI: 0000000000000000  RDI: 0000000000000000
>     RBP: ffffa7adcd64fcb0   R8: 0000000000000002   R9: 000000000000fbff
>     R10: 0000000000000000  R11: 0000000000000000  R12: ffff98c047af4720
>     R13: ffff97eb87cd32a0  R14: ffff97eb87cd3000  R15: ffffa7adcd64fd57
>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
>  #9 [ffffa7adcd64fc80] free_msi_irqs at ffffffffa645d0fc
>  #10 [ffffa7adcd64fcb8] pci_disable_msix at ffffffffa645d896
>  #11 [ffffa7adcd64fce0] igb_reset_interrupt_capability at ffffffffc024f335 [igb]
>  #12 [ffffa7adcd64fd08] __igb_shutdown at ffffffffc0258ed7 [igb]
>  #13 [ffffa7adcd64fd48] igb_shutdown at ffffffffc025908b [igb]
>  #14 [ffffa7adcd64fd70] pci_device_shutdown at ffffffffa6441e3a
>  #15 [ffffa7adcd64fd98] device_shutdown at ffffffffa6570260
>  #16 [ffffa7adcd64fdc8] kernel_power_off at ffffffffa60c0725
>  #17 [ffffa7adcd64fdd8] SYSC_reboot at ffffffffa60c08f1
>  #18 [ffffa7adcd64ff18] sys_reboot at ffffffffa60c09ee
>  #19 [ffffa7adcd64ff28] do_syscall_64 at ffffffffa6003ca9
>  #20 [ffffa7adcd64ff50] entry_SYSCALL_64_after_hwframe at ffffffffa6a001b1
> 
> This happens because igb_shutdown has not yet freed the allocated irqs, so
> free_msi_irqs finds irq_has_action true for the MSI irqs involved, and this
> condition triggers BUG_ON.
> 
> Freeing the irqs before proceeding further in igb_clear_interrupt_scheme
> fixes this problem.
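
The ordering constraint described above can be sketched in plain userspace C.
This is only a toy model under stated assumptions, not the igb driver or the
PCI/MSI core: struct toy_dev, NUM_VECTORS and every toy_* function are
hypothetical names made up for illustration. toy_disable_msix() returns false
at the point where the quoted text says free_msi_irqs would hit BUG_ON.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_VECTORS 4 /* hypothetical vector count, chosen for the example */

/* Toy stand-in for a device's MSI-X state; not a kernel structure. */
struct toy_dev {
	bool has_action[NUM_VECTORS]; /* true while a handler is registered */
	bool msix_enabled;
};

/* Models bringing the interface up: every vector gets a handler. */
static void toy_request_irqs(struct toy_dev *d)
{
	for (int i = 0; i < NUM_VECTORS; i++)
		d->has_action[i] = true;
	d->msix_enabled = true;
}

/* Models the step the patch adds to the shutdown path: drop all handlers. */
static void toy_free_irqs(struct toy_dev *d)
{
	for (int i = 0; i < NUM_VECTORS; i++)
		d->has_action[i] = false;
}

/*
 * Models the MSI-X teardown. Returns false if any vector still has a
 * handler, which is the condition reported to trigger BUG_ON in the trace.
 */
static bool toy_disable_msix(struct toy_dev *d)
{
	for (int i = 0; i < NUM_VECTORS; i++)
		if (d->has_action[i])
			return false;
	d->msix_enabled = false;
	return true;
}

/* Shutdown in either order; returns 0 on success, -1 on the "BUG" case. */
static int toy_shutdown(struct toy_dev *d, bool free_irqs_first)
{
	assert(d);
	if (free_irqs_first)
		toy_free_irqs(d);
	return toy_disable_msix(d) ? 0 : -1;
}
```

In this model, calling toy_shutdown() with free_irqs_first set to false after
toy_request_irqs() fails, while the same call with it set to true succeeds,
which mirrors the before and after behaviour the commit message claims.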
> 
> Signed-off-by: Imran Khan <imran.f.khan@xxxxxxxxxx>
> ---
> 
> This issue does not happen in v5.17 or later kernel versions because
> commit 9fb9eb4b59ac ("PCI/MSI: Let core code free MSI descriptors")
> explicitly frees MSI-based irqs and hence indirectly fixes this issue
> as well. This is also why I have listed that commit as the equivalent
> upstream commit. But that upstream change itself depends on a series
> of changes starting from commit 288c81ce4be7 ("PCI/MSI: Move code into a
> separate directory"), which refactored the MSI driver into multiple parts.
> So another way of fixing this issue would be to backport these patches and
> get this issue implicitly fixed.
> Kindly let me know if my current patch is not acceptable, and in that case
> whether it would be fine for me to backport the above-mentioned MSI driver
> refactoring patches to LTS.

What would the real patch series look like?  How bad are the backports?
Try that out first please.

thanks,

greg k-h



