On 9/12/2018 4:28 PM, Bjorn Helgaas wrote: > On Mon, Jul 30, 2018 at 04:21:44PM -0500, Alexandru Gagniuc wrote: >> When a PCI device is gone, we don't want to send IO to it if we can >> avoid it. We expose functionality via the irq_chip structure. As >> users of that structure may not know about the underlying PCI device, >> it's our responsibility to guard against removed devices. > > I'm pretty ambivalent about pci_dev_is_disconnected() in general, but > I think I'll take this, given a couple minor changelog clarifications: > >> irq_write_msi_msg is already guarded. pci_msi_(un)mask_irq are not. >> Guard them for completeness. > > By the irq_write_msi_msg() guard, I guess you mean this path: > > pci_msi_domain_write_msg # irq_chip.irq_write_msi_msg > __pci_write_msi_msg > if (dev->current_state != PCI_D0 || pci_dev_is_disconnected(dev)) > /* don't touch */ Yes! > pci_msi_(un)mask_irq() may be irq_chip.irq_mask, .irq_unmask, etc > pointers. So these are parallel because they're all irq_chip function > pointers, but the changelog isn't (yet) parallel because it uses the > irq_chip pointer name for .irq_write_msi_msg but not for mask/unmask Good catch! I'll get this corrected. >> For example, surprise removal of a PCIe device triggers teardown. This >> touches the irq_chips ops some point to disable the interrupts. I/O >> generated here can crash the system on machines with buggy firmware. >> Not triggering the IO in the first place eliminates the problem. > > It doesn't eliminate the problem completely because .irq_mask() and > .irq_unmask() may be called for reasons other than surprise removal, > and if a surprise removal happens after the pci_dev_is_disconnected() > check but before the readl(), we will still generate I/O to a device > that's gone. I'd be OK if you said it "reduces" the problem. That sounds reasonable. > One reason I'm ambivalent about pci_dev_is_disconnected() is that in > cases like this, it turns a reproducible problem into a very > hard-to-reproduce problem, which reduces the likelihood that the buggy > firmware will be fixed. If it manages to turn this into 99.999% territory, I'll be much happier. I'd love to give you an academically correct solution, but I just don't see how, given how firmware-first philosophy is written. > Do you have information about known platforms with this buggy firmware > and the signature of the crash? If you do, it's always nice to be > able to connect a patch with the user-visible problem it fixes. From what I've heard, it won't be fixed. The number of changes needed would require re-qualifying the firmware. I'm told that's very hard to do on platforms that are shipping. I can reword this to say "firmware-first" instead of "buggy" since they are mostly synonymous. Alex >> Signed-off-by: Alexandru Gagniuc <mr.nuke.me@xxxxxxxxx> >> --- >> >> There's another patch by Lukas Wunner that is needed (not yet published) >> in order to fully block IO on SURPRISE!!! removal. The existing code only >> sets the PCI_DEV_DISCONNECTED bit in an unreasonably narrow set of >> circumstances. Lukas' patch fixes that. >> >> However, this change is otherwise fully independent, and enjoy! >> >> drivers/pci/msi.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c >> index 4d88afdfc843..5f47b5cb0401 100644 >> --- a/drivers/pci/msi.c >> +++ b/drivers/pci/msi.c >> @@ -227,6 +227,9 @@ static void msi_set_mask_bit(struct irq_data *data, u32 flag) >> { >> struct msi_desc *desc = irq_data_get_msi_desc(data); >> >> + if (pci_dev_is_disconnected(msi_desc_to_pci_dev(desc))) >> + return; >> + >> if (desc->msi_attrib.is_msix) { >> msix_mask_irq(desc, flag); >> readl(desc->mask_base); /* Flush write to device */ >> -- >> 2.17.1 >> >