On Fri, 25 Jul 2008 07:42:52 -0600 Matthew Wilcox <matthew@xxxxxx> wrote: > On Fri, Jul 25, 2008 at 03:29:18PM +0200, Michal Schmidt wrote: > > The interesting thing is that I can see Destination ID bits of MSI > > Message Address change correctly in lspci output. But the interrupt > > is still delivered load-balanced to all CPUs even though the > > Destination ID identifies the single CPU I asked for. It seems the > > device only takes the new Message Address setting into account when > > the MSI Enable bit in the Message Control register is changed from > > 0 to 1. I tested this by setting the MSI enable bit to 0 and then > > immediately back to 1 at the end of > > io_apic_64.c:set_msi_irq_affinity(). > > > > Is this a permitted behaviour for the device? I couldn't find > > anything in the PCI specification that would mentioned it. > > I don't think that's necessary. However, the thought occurs that we > ought to disable MSI, then write the address, then re-enable MSI. It > doesn't cause a problem at the moment because we don't change the > top 32 bits of the address (at least on any of my systems ..) but > theoretically if we were to use a 64-bit address, we would experience > MSIs being sent to an address that was a mixture of the top 32 bits of > the old address and the bottom 32 bits of the new address. > > We definitely can already get tearing when we've written the lower > address register but not the data register yet (also true for MSIX, by > the way). So we ought to fix this properly. > > We have the problem that we might still get interrupts on the old > pin-based interrupt line (ie David's original problem). I have a > feeling somebody needs to register a handler for the pin-based > interrupt to handle this. One possibility would be for the MSI code > to register a handler that calls the driver's MSI handler. I don't > think that's a good idea though -- the driver's MSI handler is able > to make different assumptions from the pin handler. Do we want to > make drivers register an interrupt handler for the original interrupt > number before they try to set up MSI? It's certainly not what the > PCI spec people had in mind, but they seem to have overlooked this > problem. > > Yuck. Maybe we should just not use MSI for devices without maskbits. What would you think of this patch?: This adds the parameter pci=msimaskbits to enable MSI only for devices with maskbits. It is useful when setting of IRQ SMP affinity is needed, because smp_affinity does not work reliably for MSI interrupts of devices without maskbits. And maybe this behaviour should be the default? Michal diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c index 15af618..2771356 100644 --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -23,6 +23,7 @@ #include "pci.h" #include "msi.h" +/* 0 = disabled, 1 = enabled, 2 = enabled only for devices with mask bits */ static int pci_msi_enable = 1; /* Arch hooks */ @@ -356,8 +357,13 @@ static int msi_capability_init(struct pci_dev *dev) msi_set_enable(dev, 0); /* Ensure msi is disabled as I set it up */ - pos = pci_find_capability(dev, PCI_CAP_ID_MSI); + pos = pci_find_capability(dev, PCI_CAP_ID_MSI); pci_read_config_word(dev, msi_control_reg(pos), &control); + if (pci_msi_enable == 2 && !is_mask_bit_support(control)) { + dev_info(&dev->dev, + "does not support maskbits, will not use MSI\n"); + return -ENODEV; + } /* MSI Entry Initialization */ entry = alloc_msi_entry(); if (!entry) @@ -749,6 +755,11 @@ void pci_no_msi(void) pci_msi_enable = 0; } +void pci_msi_require_mask_bits(void) +{ + pci_msi_enable = 2; +} + void pci_msi_init_pci_dev(struct pci_dev *dev) { INIT_LIST_HEAD(&dev->msi_list); diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index d00f0e0..0e6019d 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -1868,6 +1868,8 @@ static int __devinit pci_setup(char *str) if (*str && (str = pcibios_setup(str)) && *str) { if (!strcmp(str, "nomsi")) { pci_no_msi(); + } else if (!strcmp(str, "msimaskbits")) { + pci_msi_require_mask_bits(); } else if (!strcmp(str, "noaer")) { pci_no_aer(); } else if (!strcmp(str, "nodomains")) { diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index d807cd7..2af4750 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -85,9 +85,11 @@ extern unsigned int pci_pm_d3_delay; #ifdef CONFIG_PCI_MSI void pci_no_msi(void); +void pci_msi_require_mask_bits(void); extern void pci_msi_init_pci_dev(struct pci_dev *dev); #else static inline void pci_no_msi(void) { } +static inline void pci_msi_require_mask_bits(void) { } static inline void pci_msi_init_pci_dev(struct pci_dev *dev) { } #endif -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html