On Fri, Aug 09, 2019 at 10:28:15AM -0700, sathyanarayanan kuppuswamy wrote:
> On 8/9/19 3:28 AM, Lukas Wunner wrote:
> > A sysfs request to enable or disable a PCIe hotplug slot should not
> > return before it has been carried out. That is sought to be achieved
> > by waiting until the controller's "pending_events" have been cleared.
> >
> > However the IRQ thread pciehp_ist() clears the "pending_events" before
> > it acts on them. If pciehp_sysfs_enable_slot() / _disable_slot() happen
> > to check the "pending_events" after they have been cleared but while
> > pciehp_ist() is still running, the functions may return prematurely
> > with an incorrect return value.
>
> Can this be fixed by changing the sequence of clearing the pending_events in
> pciehp_ist() ?

It can't. The processing logic is such that pciehp_ist() atomically
removes bits from pending_events and acts upon them. Simultaneously,
new events may be queued up by adding bits to pending_events (through
a hardirq handled by pciehp_isr(), through a sysfs request, etc).
Those will be handled in an additional iteration of pciehp_ist().

If I delayed removing bits from pending_events, I couldn't tell whether
new events have accumulated while others were being processed.
E.g. a PDS event may occur while another one is being processed.
The second PDS event may signify a card removal immediately after
the card has been brought up. It's crucial not to lose the second
PDS event but act properly on it by bringing the slot down again.

This way of processing events also allows me to easily filter events.
E.g. we tolerate link flaps occurring during the first 100 ms after
enabling the slot simply by atomically removing bits from pending_events
at a certain point. See commit 6c35a1ac3da6 ("PCI: pciehp: Tolerate
initially unstable link").

Now what I *could* do would be to make the events currently being
processed public, e.g. by adding an "atomic_t current_events" to
struct controller. Then I could wait in pciehp_sysfs_enable_slot() /
_disable_slot() until both "pending_events" and "current_events"
become empty. But it would basically amount to the same as this patch,
and we don't really need to know *which* events are being processed,
only the *fact* that events are being processed.

Let me know if you have further questions regarding the pciehp
processing logic.

Thanks,

Lukas
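
For illustration only, the scheme described above can be sketched in
userspace C with C11 atomics. This is not the pciehp code: struct toy_ctrl,
the toy_* functions, the EV_* bits and the "busy" flag are all hypothetical
names made up for this sketch. It shows why waiting on "pending_events"
alone can return too early, and how a plain "processing in progress"
indication closes that window.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event bits, for illustration only. */
#define EV_PRESENCE_CHANGE 0x1u  /* stands in for a PDS event */
#define EV_LINK_CHANGE     0x2u

struct toy_ctrl {
	atomic_uint pending_events;  /* queued, not yet picked up */
	atomic_bool busy;            /* assumed flag: a batch is being handled */
	unsigned int handled[8];     /* log of batches, in handling order */
	size_t n_handled;
	bool pending_empty_during;   /* observations made mid-processing */
	bool idle_during;
};

static void toy_init(struct toy_ctrl *c)
{
	assert(c != NULL);
	atomic_init(&c->pending_events, 0u);
	atomic_init(&c->busy, false);
	c->n_handled = 0;
	c->pending_empty_during = false;
	c->idle_during = false;
}

/* Producer side: adds bits without disturbing ones already queued. */
static void toy_queue_event(struct toy_ctrl *c, unsigned int ev)
{
	atomic_fetch_or(&c->pending_events, ev);
}

/* What a waiter should check: nothing queued AND nothing in flight. */
static bool toy_idle(struct toy_ctrl *c)
{
	return atomic_load(&c->pending_events) == 0 && !atomic_load(&c->busy);
}

/*
 * One consumer iteration. "busy" is set before the bits are taken, so
 * there is no instant at which pending_events is empty while busy is
 * still false. The optional hook runs mid-processing to simulate
 * concurrent activity. Returns false if there was nothing to do.
 */
static bool toy_process_once(struct toy_ctrl *c,
			     void (*hook)(struct toy_ctrl *))
{
	atomic_store(&c->busy, true);
	unsigned int events = atomic_exchange(&c->pending_events, 0u);

	if (events != 0) {
		if (hook)
			hook(c);
		if (c->n_handled < sizeof(c->handled) / sizeof(c->handled[0]))
			c->handled[c->n_handled++] = events;
	}
	atomic_store(&c->busy, false);
	return events != 0;
}

/* Hook: record what a waiter would see while a batch is in flight. */
static void toy_observe(struct toy_ctrl *c)
{
	c->pending_empty_during = atomic_load(&c->pending_events) == 0;
	c->idle_during = toy_idle(c);
}

/* Hook: a second presence change arrives while the first is handled. */
static void toy_inject_presence(struct toy_ctrl *c)
{
	toy_queue_event(c, EV_PRESENCE_CHANGE);
}
```

Because the bits are taken with a single atomic exchange, an event queued
by toy_inject_presence() during processing stays in pending_events and is
picked up by the next call to toy_process_once() rather than being lost.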