On Sat, Mar 15, 2025 at 07:57:55PM +0100, proxy0@xxxxxxxxxxxx wrote:
> Mar 15, 2025, 9:28 PM by lukas@xxxxxxxxx:
> > After dwelling on this for a while, I'm thinking that it may re-introduce
> > the issue fixed by commit f5eff5591b8f ("PCI: pciehp: Fix AB-BA deadlock
> > between reset_lock and device_lock"):
> >
> > Looking at the second and third stack trace in its commit message,
> > down_write(reset_lock) in pciehp_reset_slot() is basically equivalent
> > to synchronize_irq() and we're holding device_lock() at that point,
> > hindering progress of pciehp_ist().
> >
> > So I think I have guided you in the wrong direction and I apologize
> > for that.
> >
> > However it seems to me that this should be solvable with the small
> > patch below. Am I missing something?
> >
> > @Joel Mathew Thomas, could you give the below patch a spin and see
> > if it helps?
>
> I've tested the patch series along with the additional patch provided.
>
> Kernel: 6.14.0-rc6-00043-g3571e8b091f4-dirty-pci-hotplug-reset-fixes-eventmask-fix
>
> Patches applied:
> - [PATCH 1/4] PCI/hotplug: Disable HPIE over reset
> - [PATCH 2/4] PCI/hotplug: Clearing HPIE for the duration of reset is enough
> - [PATCH 3/4] PCI/hotplug: reset_lock is not required synchronizing with irq thread
> - [PATCH 4/4] PCI/hotplug: Don't enable HPIE in poll mode
> - The latest patch from you:
> +	/* Ignore events masked by pciehp_reset_slot(). */
> +	events &= ctrl->slot_ctrl;
> +	if (!events)
> +		return IRQ_HANDLED;

Could you test *only* the quoted diff, i.e. without patches [1/4] - [4/4],
on top of a recent kernel? Sorry for not having been clear about this.

I believe that patch [1/4] will re-introduce a deadlock we've already
fixed two years ago, so the small diff above seeks to replace it with
a simpler approach that will hopefully avoid the issue as well.

Thanks,

Lukas