On 7/3/2018 9:59 AM, Lukas Wunner wrote:
> On Tue, Jul 03, 2018 at 09:31:24AM -0400, Sinan Kaya wrote:
>> Issue is observing hotplug link down event in the middle of AER recovery
>> as in my previous reply.
>>
>> If we mask hotplug interrupts before secondary bus reset via my patch,
>> then hotplug driver will not observe both link up and link down interrupts.
>>
>> If we don't mask hotplug interrupts, we have a race condition.
>
> I assume that a bus reset not only causes a link and presence event but
> also clears the Presence Detect State bit in the Slot Status register
> and the Data Link Layer Link Active bit in the Link Status register
> momentarily.
>
> pciehp may access those two bits concurrently to the AER driver
> performing a slot reset. So it may not be sufficient to mask
> the interrupt.
>
> I've posted this patch to address the issue:
> https://patchwork.ozlabs.org/patch/930391/

Very interesting! I missed this completely.

I know for a fact that a bus reset clears the Data Link Layer Link Active
bit as soon as the link goes down. It gets set again following link up.

Presence detect depends on the HW implementation. QDT root ports, for
instance, don't change presence detect, since nobody actually removed the
card.

If an implementation supports in-band presence detect, the answer is yes.
As soon as the link goes down, the presence detect bit will get cleared
until recovery.

It sounds like we need to update your lock change with my proposal:

lock() in mask_irq() and unlock() in unmask_irq()

>
> Thanks,
>
> Lukas
>

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
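
To make the "lock() in mask_irq() and unlock() in unmask_irq()" idea concrete, here is a rough userspace model, not kernel code. Everything in it is an assumption for illustration: struct slot_model, SLOT_MODEL_INIT, the bus_reset_*() helpers and hotplug_try_read_link() are hypothetical names, and a C11 atomic_flag stands in for whatever lock the real patch would use. The hotplug side uses a try-lock here only so the sketch can be exercised from a single thread; a real driver would presumably block instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of one hotplug slot; not kernel code. */
struct slot_model {
    atomic_flag reset_lock;   /* held for the whole bus reset window */
    bool irq_masked;          /* models the hotplug interrupt mask */
    bool link_active;         /* models Data Link Layer Link Active */
};

/* Lock free, interrupts unmasked, link up. */
#define SLOT_MODEL_INIT { ATOMIC_FLAG_INIT, false, true }

/* Recovery path: take the lock first, then mask hotplug interrupts. */
static void mask_irq(struct slot_model *s)
{
    while (atomic_flag_test_and_set(&s->reset_lock))
        ;   /* spin until any current holder releases the lock */
    s->irq_masked = true;
}

/* Recovery path: unmask, then drop the lock taken in mask_irq(). */
static void unmask_irq(struct slot_model *s)
{
    assert(s->irq_masked);   /* must pair with a prior mask_irq() */
    s->irq_masked = false;
    atomic_flag_clear(&s->reset_lock);
}

/* Model the secondary bus reset: the link bit drops, then comes back. */
static void bus_reset_link_down(struct slot_model *s) { s->link_active = false; }
static void bus_reset_link_up(struct slot_model *s)   { s->link_active = true; }

/*
 * Hotplug path: sample the link bit only when no reset is in flight.
 * Returns false and leaves *active untouched if the lock is held.
 */
static bool hotplug_try_read_link(struct slot_model *s, bool *active)
{
    if (atomic_flag_test_and_set(&s->reset_lock))
        return false;
    *active = s->link_active;
    atomic_flag_clear(&s->reset_lock);
    return true;
}
```

In this model, a hotplug read attempted between mask_irq() and unmask_irq() is refused, so it never sees the momentarily cleared link bit; once unmask_irq() has run, the read succeeds and sees the link up again.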