On 18/10/2018 19:15, Bjorn Helgaas wrote: > On Thu, Oct 18, 2018 at 03:37:19PM -0300, Guilherme G. Piccoli wrote: > [...] > I don't want to expand the early quirk infrastructure unless there is > absolutely no other way to solve this. The early quirk stuff is > x86-specific, and it's not obvious that this problem is x86-only. > > This patch scans buses 0-255, but still only in domain 0, so it won't > help with even more complicated systems that use other domains. > > I'm not an IRQ expert, but it seems wrong to me that we are enabling > this interrupt before we're ready for it. The MSI should target an > IOAPIC. Can't that IOAPIC entry be masked until later? I guess the > kdump kernel doesn't know what MSI address the device might be using. > > Could the IRQ core be more tolerant of this somehow, e.g., if it > notices incoming interrupts with no handler, could it disable the > IOAPIC entry and fall back to polling periodically until a handler is > added? Hi Bjorn, thanks for your quick reply. I understand your point, but I think this is inherently an architecture problem. No matter what solution we decide for, it'll need to be applied in early boot time, like before the PCI layer gets initialized. So, I think a first step would be to split the solution "timing" in 2 possibilities: a) We could try to disable MSIs or whatever approach we take in the quiesce path of crash_kexec(), before the bootstrap of the kdump kernel. The pro is we could use PCI handlers to do it generically. The con is it'd touch that delicate shutdown path, from a broken kernel, and this is unreliable. Also, I've noticed changes in those crash paths usually gain huge amount of criticism by community, seems nobody wants to change a bit of this code, if not utterly necessary. b) Continue using an early boot approach. IMO, this would be per-arch by nature. Currently, powerpc for example does not suffer this issue due to their arch code performing a FW-aided PCI fundamental reset in the devices[0]. On the other hand, x86 has no generic fundamental reset infrastructure to my knowledge (we tried some alternatives, like a Bridge reset[1] that didn't work, or zeroing the the command register, which worked), but if we go with the IOAPIC way of handling this (which we tried a bit and failed too), it'll be even more arch-dependent, since IOAPIC is x86 concept. After discussing here internally, an alternative way for this MSI approach work without requiring the change in the early PCI infrastructure is to check if we're in kdump kernel and perform manually the full scan in that case, instead of changing the generic case as proposed here. This would still be x86-only, but again, it's difficult if not impossible to fix all archs using the same code here. Finally, about multi-domain PCI topologies, I've never saw it on x86, I wasn't aware that such things existed in x86 - but if required we can quickly extend the logic to contemplate it too. Thanks again, looking forward for you suggestions. Cheers, Guilherme [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/platforms/powernv/pci-ioda.c#n3992 [1] Based in https://patchwork.kernel.org/patch/2562841, adapted to work in early boot time.