On Sat, 15 Mar 2025, Lukas Wunner wrote: > On Thu, Mar 13, 2025 at 04:23:30PM +0200, Ilpo Järvinen wrote: > > pciehp_reset_slot() disables PDCE (Presence Detect Changed Enable) and > > DLLSCE (Data Link Layer State Changed Enable) for the duration of reset > > and clears the related status bits PDC and DLLSC from the Slot Status > > register after the reset to avoid hotplug incorrectly assuming the card > > was removed. > > > > However, hotplug shares interrupt with PME and BW notifications both of > > which can make pciehp_isr() to run despite PDCE and DLLSCE bits being > > off. pciehp_isr() then picks PDC or DLLSC bits from the Slot Status > > register due to the events that occur during reset and caches them into > > ->pending_events. Later, the IRQ thread in pciehp_ist() will process > > the ->pending_events and will assume the Link went Down due to a card > > change (in pciehp_handle_presence_or_link_change()). > > > > Change pciehp_reset_slot() to also clear HPIE (Hot-Plug Interrupt > > Enable) as pciehp_isr() will first check HPIE to see if the interrupt > > is not for it. Then synchronize with the IRQ handling to ensure no > > events are pending, before invoking the reset. > > After dwelling on this for a while, I'm thinking that it may re-introduce > the issue fixed by commit f5eff5591b8f ("PCI: pciehp: Fix AB-BA deadlock > between reset_lock and device_lock"): > > Looking at the second and third stack trace in its commit message, > down_write(reset_lock) in pciehp_reset_slot() is basically equivalent > to synchronize_irq() and we're holding device_lock() at that point, > hindering progress of pciehp_ist(). This description was somewhat confusing but what I can see, now that you mentioned this, is that if pciehp_reset_slot() calls synchronize_irq(), it can result in trying to acquire device_lock() again while trying to drain the pending events. ->reset_lock seems irrelevant to that problem. Thus, pciehp_reset_slot() cannot ever rely on completing the processing of all pending events before it invokes the reset as long as any of its callers is holding device_lock(). It's a bit sad, because removing most of the reset_lock complexity would have been nice simplification in locking, effectively it would have reverted f5eff5591b8f too. > So I think I have guided you in the wrong direction and I apologize > for that. > > However it seems to me that this should be solvable with the small > patch below. Am I missing something? > > @Joel Mathew Thomas, could you give the below patch a spin and see > if it helps? > > Thanks! > > -- >8 -- > > diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c > index bb5a8d9f03ad..99a2ac13a3d1 100644 > --- a/drivers/pci/hotplug/pciehp_hpc.c > +++ b/drivers/pci/hotplug/pciehp_hpc.c > @@ -688,6 +688,11 @@ static irqreturn_t pciehp_isr(int irq, void *dev_id) > return IRQ_HANDLED; > } > > + /* Ignore events masked by pciehp_reset_slot(). */ > + events &= ctrl->slot_ctrl; > + if (!events) > + return IRQ_HANDLED; > + > /* Save pending events for consumption by IRQ thread. */ > atomic_or(events, &ctrl->pending_events); > return IRQ_WAKE_THREAD; Yes, this should work, I think. I'm not entirely sure though how reading ->slot_ctrl here synchronizes wrt. pciehp_reset_slot() invoking reset. What guarantees pciehp_isr() sees the updated ->slot_ctrl when pciehp_reset_slot() has proceeded to invoke the reset? (I'm in general very hesitant about lockless and barrierless reader being race free, I might be just paranoid about it.) -- i.