Bjorn Helgaas <bhelgaas@xxxxxxxxxx> writes: > On Wed, Oct 19, 2011 at 5:02 PM, Bjorn Helgaas <bhelgaas@xxxxxxxxxx> wrote: >> On Mon, Oct 17, 2011 at 12:46 PM, Eric W. Biederman >> <ebiederm@xxxxxxxxxxxx> wrote: >>> >>> I traced a nasty kexec on panic boot failure to the fact that we had >>> screaming msi interrupts and we were not disabling the msi messages at >>> kernel startup. The booting kernel had not enabled those interupts so >>> was not prepared to handle them. >>> >>> I can see no reason why we would ever want to leave the msi interrupts >>> enabled at boot if something else has enabled those interrupts. The pci >>> spec specifies that msi interrupts should be off by default. Drivers >>> are expected to enable the msi interrupts if they want to use them. Our >>> interrupt handling code reprograms the interrupt handlers at boot and >>> will not be be able to do anything useful with an unexpected interrupt. >>> >>> This patch applies cleanly all of the way back to 2.6.32 where I noticed >>> the problem. >>> >>> Cc: stable@xxxxxxxxxx >>> >>> Signed-off-by: Eric W. Biederman <ebiederm@xxxxxxxxxxxx> >>> --- >>> drivers/pci/msi.c | 10 ++++++++++ >>> 1 files changed, 10 insertions(+), 0 deletions(-) >>> >>> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c >>> index 2f10328..e174982 100644 >>> --- a/drivers/pci/msi.c >>> +++ b/drivers/pci/msi.c >>> @@ -869,5 +869,15 @@ EXPORT_SYMBOL(pci_msi_enabled); >>> >>> void pci_msi_init_pci_dev(struct pci_dev *dev) >>> { >>> + int pos; >>> INIT_LIST_HEAD(&dev->msi_list); >>> + >>> + /* Disable the msi hardware to avoid screaming interrupts >>> + * during boot. This is the power on reset default so >>> + * usually this should be a noop. >>> + */ >>> + pos = pci_find_capability(dev, PCI_CAP_ID_MSI); >>> + if (pos) >>> + msi_set_enable(dev, pos, 0); >>> + msix_set_enable(dev, 0); >>> } >> >> We only build msi.o if CONFIG_PCI_MSI=y, but don't you want to disable >> these even if the new kernel has CONFIG_PCI_MSI=n? > > Since problems like this are so nasty to debug, maybe it'd be worth a > note in dmesg if we found MSI enabled? I guess I don't know exactly > what we'd do with the information, but sometimes we can't predict in > advance what things are useful. It was nasty because it was hard to get an interrupt to scream, and then it was just plain weird as it was an oversight in our logic. Once I enabled debugging output it was very clear I had a screaming interrupt. A warn once might not be a bad idea. I would hate to print this for every msi-x capable pci device in the system as we boot up. Still once we have my patch applied the situation will be handled so I don't know of any reason why we would care if msi's were enabled at boot up. And if there is an irq screaming we do have pretty good messages. This one was just a little extra tricky because the kernel wasn't expecting these interrupts. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html