On Tue, Nov 26, 2019 at 5:37 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> On Mon, Nov 25, 2019 at 03:03:23PM -0600, Stuart Hayes wrote:
> >
> > On 11/12/19 3:59 PM, Stuart Hayes wrote:
> > > The pciehp interrupt handler pciehp_isr() will read the slot status
> > > register and then write back to it to clear just the bits that caused the
> > > interrupt. If a different interrupt event bit gets set between the read and
> > > the write, pciehp_isr() will exit without having cleared all of the
> > > interrupt event bits, so we will never get another hotplug interrupt from
> > > that device.
> > >
> > > That is expected behavior according to the PCI Express spec (v.5.0, section
> > > 6.7.3.4, "Software Notification of Hot-Plug Events").
> > >
> > > Because the "presence detect changed" and "data link layer state changed"
> > > event bits are both getting set at nearly the same time when a device is
> > > added or removed, this is more likely to happen than it might seem. The
> > > issue can be reproduced rather easily by connecting and disconnecting an
> > > NVMe device on at least one system model.
> > >
> > > This patch fixes the issue by modifying pciehp_isr() to loop back and
> > > re-read the slot status register immediately after writing to it, until
> > > it sees that all of the event status bits have been cleared.
> > >
> > > Signed-off-by: Stuart Hayes <stuart.w.hayes@xxxxxxxxx>
> >
> > Bjorn,
> >
> > Do you have any comments or issues with this patch set? Anything I can do?
>
> Were you planning to address Lukas' comments?
>
> https://lore.kernel.org/r/20191114025022.wz3gchr7w67fjtzn@xxxxxxxxx

Yes, I submitted a V2 for this patch (https://lkml.org/lkml/2019/11/20/1147).

But--I'm very sorry, I didn't mean to ask if you had any comments on this
patch--I meant to ask about an earlier patch set, and accidentally replied
to the wrong thread.

I meant to ask you about this patch set:

https://lore.kernel.org/lkml/20191025190047.38130-1-stuart.w.hayes@xxxxxxxxx/

Thank you!
--Stuart