On 23/10/2018 14:03, Bjorn Helgaas wrote:
> On Mon, Oct 22, 2018 at 05:35:06PM -0300, Guilherme G. Piccoli wrote:
>> On 18/10/2018 19:15, Bjorn Helgaas wrote:
>>> On Thu, Oct 18, 2018 at 03:37:19PM -0300, Guilherme G. Piccoli wrote:
>>> [...]
>> I understand your point, but I think this is inherently an architecture
>> problem. No matter what solution we decide for, it'll need to be applied
>> in early boot time, like before the PCI layer gets initialized.
>
> This is the part I want to know more about. Apparently there's some
> event X between early_quirks() and the PCI device enumeration, and we
> must disable MSIs before X:
>
>   setup_arch()
>     early_quirks()                # arch/x86/kernel/early-quirks.c
>       early_pci_clear_msi()
>   ...
>   X
>   ...
>   pci_scan_root_bus_bridge()
>     ...
>     DECLARE_PCI_FIXUP_EARLY       # drivers/pci/quirks.c
>
> I want to know specifically what X is. If we don't know what X is and
> all we know is "we have to disable MSIs earlier than PCI init", then
> we're likely to break things again in the future by changing the order
> of disabling MSIs and whatever X is.
>
> Bjorn
>

Hi Bjorn (and all CCed),

I'm sorry to necro-bump a thread more than 2 years later, but recent
discussions led to a better understanding of this 'X' point, thanks to
Thomas!

For those who deleted this thread from their email clients, it's
available at [0]. The summary is that we faced an IRQ storm very early
in boot, caused by bogus MSI behavior from a PCIe device, when booting a
kdump kernel. The machine got stuck during boot and we couldn't kdump.
The solution proposed here is to clear MSI interrupts early on x86, if a
parameter is provided. I don't have the reproducer anymore, and it was
pretty hard to reproduce in virtual environments.

So, about the 'X', Bjorn: in another recent thread about IRQ storms [1],
Thomas clarified it, and after a brief discussion it seems there's no
better way to prevent the MSI storm than clearing the MSI capability
early in boot.

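To make "clearing the MSI capability" concrete, here is a minimal
userspace sketch. It is not the code from this series: it assumes a
device's 256-byte config space has been copied into a plain array, and
the names cfg_read16(), cfg_write16() and cfg_clear_msi() are
hypothetical. Only the register offsets and bit positions come from the
PCI specification. It walks the capability list and clears the enable
bit of any MSI or MSI-X capability it finds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard PCI config-space offsets and bits. */
#define PCI_STATUS            0x06
#define PCI_STATUS_CAP_LIST   0x10    /* device has a capability list */
#define PCI_CAPABILITY_LIST   0x34    /* offset of first capability */
#define PCI_CAP_ID_MSI        0x05
#define PCI_CAP_ID_MSIX       0x11
#define PCI_MSG_CTRL          2       /* Message Control, relative to cap */
#define PCI_MSI_FLAGS_ENABLE  0x0001  /* MSI Enable, bit 0 */
#define PCI_MSIX_FLAGS_ENABLE 0x8000  /* MSI-X Enable, bit 15 */

/* Config space is little-endian. */
static uint16_t cfg_read16(const uint8_t *cfg, unsigned int off)
{
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

static void cfg_write16(uint8_t *cfg, unsigned int off, uint16_t val)
{
	cfg[off] = (uint8_t)(val & 0xff);
	cfg[off + 1] = (uint8_t)(val >> 8);
}

/*
 * Hypothetical helper: clear the MSI/MSI-X enable bits in a simulated
 * 256-byte config space. Returns how many capabilities were disabled.
 */
int cfg_clear_msi(uint8_t *cfg)
{
	int cleared = 0;
	int ttl = 48;	/* arbitrary bound so a corrupt list cannot loop forever */
	uint8_t pos;

	assert(cfg != NULL);

	if (!(cfg_read16(cfg, PCI_STATUS) & PCI_STATUS_CAP_LIST))
		return 0;

	/* Capability pointers are dword-aligned and live above the header. */
	pos = cfg[PCI_CAPABILITY_LIST] & 0xfc;
	while (pos >= 0x40 && ttl-- > 0) {
		uint8_t id = cfg[pos];
		uint16_t mask = 0;

		if (id == 0xff)		/* device is not responding */
			break;
		if (id == PCI_CAP_ID_MSI)
			mask = PCI_MSI_FLAGS_ENABLE;
		else if (id == PCI_CAP_ID_MSIX)
			mask = PCI_MSIX_FLAGS_ENABLE;

		if (mask) {
			uint16_t ctrl = cfg_read16(cfg, pos + PCI_MSG_CTRL);

			if (ctrl & mask) {
				cfg_write16(cfg, pos + PCI_MSG_CTRL,
					    (uint16_t)(ctrl & ~mask));
				cleared++;
			}
		}
		pos = cfg[pos + 1] & 0xfc;	/* next capability pointer */
	}
	return cleared;
}
```

An early-boot version would have to go through the architecture's early
config-space accessors instead of an array, and would repeat this for
every bus/device/function it can reach; the sketch only shows the
per-device part.
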
As discussed both here and in thread [1], this is indeed a
per-architecture issue (powerpc is not subject to it, thanks to a better
FW reset mechanism), so I think we could still benefit from having this
idea implemented upstream, at least on x86 (we could expand to other
architectures in the future, if desired). As a "test" data point, this
has been implemented in Ubuntu (the same 3 patches present in this
series) for ~2 years and we haven't received bug reports. I'm saying
that because I understand your concerns about expanding the scope of the
early PCI quirks.

Let me know your thoughts. I'd suggest everyone read thread [1], which
addresses a similar issue at a different "moment" of the system boot and
provides some more insight into why the early MSI clearing seems to make
sense.

Thanks,

Guilherme

[0] https://lore.kernel.org/linux-pci/20181018183721.27467-1-gpiccoli@xxxxxxxxxxxxx
[1] https://lore.kernel.org/lkml/87y2js3ghv.fsf@xxxxxxxxxxxxxxxxxxxxxxx