Rafael J. Wysocki wrote:
> On Saturday 17 January 2009, Rafael J. Wysocki wrote:
(snip)
>> If MSI-X are supported, it allocates as many vectors as there are entries
>> in the port's MSI-X table, but no more than 32, and figures out which of them
>> will be used for the port services.
>
> The patch didn't check which services are available during the MSI-X set up
> which was wrong.
>
> Also, in the meantime, I thought it might be a good idea to free the interrupt
> routing table entries that aren't going to be used after all.
>
> The patch below adds this to the previous version and checks for the
> availability of port services in the MSI-X setup resume. I hope it will
> be acceptable to everyone.
>
> Thanks,
> Rafael
>
> ---
> Subject: PCI PCIe portdrv: Fix allocation of interrupts (rev. 5)
> From: Rafael J. Wysocki <rjw@xxxxxxx>
>
> If MSI-X interrupt mode is used by the PCI Express port driver, too
> many vectors are allocated and it is not ensured that the right
> vectors will be used for the right services. Namely, the PCI Express
> specification states that both PCI Express native PME and PCI Express
> hotplug will always use the same MSI or MSI-X message for signalling
> interrupts, which implies that the same vector will be used by both
> of them. Also, the VC service does not use interrupts at all.
> Moreover, it is not clear which of the vectors allocated by
> pci_enable_msix() in the current code will be used for PME and
> hotplug and which of them will be used for AER if all of these
> services are configured.
>
> For these reasons, rework the allocation of interrupts for PCI
> Express ports so that if MSI-X are enabled, the right vectors will be
> used for the right purposes.
>
> Signed-off-by: Rafael J. Wysocki <rjw@xxxxxxx>
> ---
>  drivers/pci/msi.c               |   24 +++-
>  drivers/pci/pcie/portdrv.h      |    6 +
>  drivers/pci/pcie/portdrv_core.c |  195 ++++++++++++++++++++++++++++++++--------
>  include/linux/pci.h             |    5 +
>  include/linux/pcieport_if.h     |   12 +-
>  5 files changed, 194 insertions(+), 48 deletions(-)
>
> Index: linux-2.6/drivers/pci/pcie/portdrv_core.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/pcie/portdrv_core.c
> +++ linux-2.6/drivers/pci/pcie/portdrv_core.c
> @@ -31,6 +31,141 @@ static void release_pcie_device(struct d
>  }
>
>  /**
> + * pcie_port_msix_add_entry - add entry to given array of MSI-X entries
> + * @entries: Array of MSI-X entries
> + * @new_entry: Index of the entry to add to the array
> + * @nr_entries: Number of entries already in the array
> + *
> + * Return value: Position of the added entry in the array
> + */
> +static int pcie_port_msix_add_entry(
> +	struct msix_entry *entries, int new_entry, int nr_entries)
> +{
> +	int j;
> +
> +	for (j = 0; j < nr_entries; j++)
> +		if (entries[j].entry == new_entry)
> +			return j;
> +
> +	entries[j].entry = new_entry;
> +	return j;
> +}
> +
> +/**
> + * pcie_port_enable_msix - try to set up MSI-X as interrupt mode for given port
> + * @dev: PCI Express port to handle
> + * @vectors: Array of interrupt vectors to populate
> + * @mask: Bitmask of port capabilities returned by get_port_device_capability()
> + *
> + * Return value: 0 on success, error code on failure
> + */
> +static int pcie_port_enable_msix(struct pci_dev *dev, int *vectors, int mask)
> +{
> +	struct msix_entry *msix_entries;
> +	int idx[PCIE_PORT_DEVICE_MAXSERVICES];
> +	int nr_entries, status, pos, i, nvec;
> +	u16 reg16;
> +	u32 reg32;
> +
> +	nr_entries = pci_msix_table_size(dev);
> +	if (!nr_entries)
> +		return -EINVAL;
> +	if (nr_entries > PCIE_PORT_MAX_MSIX_ENTRIES)
> +		nr_entries = PCIE_PORT_MAX_MSIX_ENTRIES;
> +
> +	msix_entries = kzalloc(sizeof(*msix_entries) * nr_entries, GFP_KERNEL);
> +	if (!msix_entries)
> +		return -ENOMEM;
> +
> +	/*
> +	 * Allocate as many entries as the device wants temporarily, so that we
> +	 * can check which of them will be useful.
> +	 */
> +	for (i = 0; i < nr_entries; i++)
> +		msix_entries[i].entry = i;

	/*
	 * So, if msix_entries is correctly equal to the number of entries this
	 * port actually uses, we'll happily go through without using the trick.
	 */

> +
> +	status = pci_enable_msix(dev, msix_entries, nr_entries);
> +	if (status)
> +		goto Exit;
> +
> +	for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++)
> +		idx[i] = -1;
> +	status = -EIO;
> +	nvec = 0;
> +
> +	if (mask & (PCIE_PORT_SERVICE_PME | PCIE_PORT_SERVICE_HP)) {
> +		int entry;
> +
> +		/*
> +		 * The code below follows the PCI Express Base Specification 2.0
> +		 * stating in Section 6.1.6 that "PME and Hot-Plug Event
> +		 * interrupts (when both are implemented) always share the same
> +		 * MSI or MSI-X vector, as indicated by the Interrupt Message
> +		 * Number field in the PCI Express Capabilities register", where
> +		 * according to Section 7.8.2 of the specification "For MSI-X,
> +		 * the value in this field indicates which MSI-X Table entry is
> +		 * used to generate the interrupt message."
> +		 */
> +		pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
> +		pci_read_config_word(dev, pos + PCIE_CAPABILITIES_REG, &reg16);
> +		entry = (reg16 >> 9) & PCIE_PORT_MSI_VECTOR_MASK;
> +		if (entry >= nr_entries)
> +			goto Error;
> +
> +		i = pcie_port_msix_add_entry(msix_entries, entry, nvec);
> +		if (i == nvec)
> +			nvec++;
> +
> +		idx[PCIE_PORT_SERVICE_PME_SHIFT] = i;
> +		idx[PCIE_PORT_SERVICE_HP_SHIFT] = i;
> +	}
> +
> +	if (mask & PCIE_PORT_SERVICE_AER) {
> +		int entry;
> +
> +		/*
> +		 * The code below follows Section 7.10.10 of the PCI Express
> +		 * Base Specification 2.0 stating that bits 31-27 of the Root
> +		 * Error Status Register contain a value indicating which of the
> +		 * MSI/MSI-X vectors assigned to the port is going to be used
> +		 * for AER, where "For MSI-X, the value in this register
> +		 * indicates which MSI-X Table entry is used to generate the
> +		 * interrupt message."
> +		 */
> +		pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_ERR);
> +		pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, &reg32);
> +		entry = reg32 >> 27;
> +		if (entry >= nr_entries)
> +			goto Error;
> +
> +		i = pcie_port_msix_add_entry(msix_entries, entry, nvec);
> +		if (i == nvec)
> +			nvec++;
> +
> +		idx[PCIE_PORT_SERVICE_AER_SHIFT] = i;
> +	}
> +

	/* Are there any unused entries? */
	if (nr_allocated > nvec) {
		/* this port has extra entries that are not for services we know... */

> +	/* Drop the temporary MSI-X setup */
> +	pci_disable_msix(dev);
> +
> +	/* Now allocate the MSI-X vectors for real */
> +	status = pci_enable_msix(dev, msix_entries, nvec);
> +	if (status)
> +		goto Error;

		/*
		 * The world has broken hardware, so even if the spec says the
		 * numbers are constant, it would be better to re-check the
		 * registers after the 2nd pci_enable_msix().
		 * Or we just skip this. (However, was this what concerned you, Rafael?)
		 */
		if (func_foo_do_paranoia_check(dev, msix_entries, nvec))
			goto Error;
	}

> +
> +	for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++)
> +		vectors[i] = idx[i] >= 0 ? msix_entries[idx[i]].vector : -1;
> +
> + Exit:
> +	kfree(msix_entries);
> +	return status;
> +
> + Error:
> +	pci_disable_msix(dev);
> +	goto Exit;
> +}
> +
> +/**
>   * assign_interrupt_mode - choose interrupt mode for PCI Express port services
>   *                         (INTx, MSI-X, MSI) and set up vectors
>   * @dev: PCI Express port to handle
> @@ -42,49 +177,31 @@ static void release_pcie_device(struct d
>  static int assign_interrupt_mode(struct pci_dev *dev, int *vectors, int mask)
>  {
>  	struct pcie_port_data *port_data = pci_get_drvdata(dev);
> -	int i, pos, nvec, status = -EINVAL;
> -	int interrupt_mode = PCIE_PORT_NO_IRQ;
> -
> -	/* Set INTx as default */
> -	for (i = 0, nvec = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
> -		if (mask & (1 << i))
> -			nvec++;
> -		vectors[i] = dev->irq;
> -	}
> -	if (dev->pin)
> -		interrupt_mode = PCIE_PORT_INTx_MODE;
> +	int irq, interrupt_mode = PCIE_PORT_NO_IRQ;
> +	int i;
>
>  	/* Check MSI quirk */
>  	if (port_data->port_type == PCIE_RC_PORT && pcie_mch_quirk)
> -		return interrupt_mode;
> +		goto Fallback;
> +
> +	/* Try to use MSI-X if supported */
> +	if (!pcie_port_enable_msix(dev, vectors, mask))
> +		return PCIE_PORT_MSIX_MODE;
> +
> +	/* We're not going to use MSI-X, so try MSI and fall back to INTx */
> +	if (!pci_enable_msi(dev))
> +		interrupt_mode = PCIE_PORT_MSI_MODE;
> +
> + Fallback:
> +	if (interrupt_mode == PCIE_PORT_NO_IRQ && dev->pin)
> +		interrupt_mode = PCIE_PORT_INTx_MODE;
> +
> +	irq = interrupt_mode != PCIE_PORT_NO_IRQ ? dev->irq : -1;
> +	for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++)
> +		vectors[i] = irq;
> +
> +	vectors[PCIE_PORT_SERVICE_VC_SHIFT] = -1;
>
> -	/* Select MSI-X over MSI if supported */
> -	pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
> -	if (pos) {
> -		struct msix_entry msix_entries[PCIE_PORT_DEVICE_MAXSERVICES] =
> -			{{0, 0}, {0, 1}, {0, 2}, {0, 3}};
> -		status = pci_enable_msix(dev, msix_entries, nvec);
> -		if (!status) {
> -			int j = 0;
> -
> -			interrupt_mode = PCIE_PORT_MSIX_MODE;
> -			for (i = 0; i < PCIE_PORT_DEVICE_MAXSERVICES; i++) {
> -				if (mask & (1 << i))
> -					vectors[i] = msix_entries[j++].vector;
> -			}
> -		}
> -	}
> -	if (status) {
> -		pos = pci_find_capability(dev, PCI_CAP_ID_MSI);
> -		if (pos) {
> -			status = pci_enable_msi(dev);
> -			if (!status) {
> -				interrupt_mode = PCIE_PORT_MSI_MODE;
> -				for (i = 0;i < PCIE_PORT_DEVICE_MAXSERVICES;i++)
> -					vectors[i] = dev->irq;
> -			}
> -		}
> -	}
>  	return interrupt_mode;
>  }
>
> Index: linux-2.6/include/linux/pcieport_if.h
> ===================================================================
> --- linux-2.6.orig/include/linux/pcieport_if.h
> +++ linux-2.6/include/linux/pcieport_if.h
> @@ -16,10 +16,14 @@
>  #define PCIE_ANY_PORT			7
>
>  /* Service Type */
> -#define PCIE_PORT_SERVICE_PME		1	/* Power Management Event */
> -#define PCIE_PORT_SERVICE_AER		2	/* Advanced Error Reporting */
> -#define PCIE_PORT_SERVICE_HP		4	/* Native Hotplug */
> -#define PCIE_PORT_SERVICE_VC		8	/* Virtual Channel */
> +#define PCIE_PORT_SERVICE_PME_SHIFT	0	/* Power Management Event */
> +#define PCIE_PORT_SERVICE_PME		(1 << PCIE_PORT_SERVICE_PME_SHIFT)
> +#define PCIE_PORT_SERVICE_AER_SHIFT	1	/* Advanced Error Reporting */
> +#define PCIE_PORT_SERVICE_AER		(1 << PCIE_PORT_SERVICE_AER_SHIFT)
> +#define PCIE_PORT_SERVICE_HP_SHIFT	2	/* Native Hotplug */
> +#define PCIE_PORT_SERVICE_HP		(1 << PCIE_PORT_SERVICE_HP_SHIFT)
> +#define PCIE_PORT_SERVICE_VC_SHIFT	3	/* Virtual Channel */
> +#define PCIE_PORT_SERVICE_VC		(1 << PCIE_PORT_SERVICE_VC_SHIFT)
>
>  /* Root/Upstream/Downstream Port's Interrupt Mode */
>  #define PCIE_PORT_NO_IRQ		(-1)
> Index: linux-2.6/drivers/pci/pcie/portdrv.h
> ===================================================================
> --- linux-2.6.orig/drivers/pci/pcie/portdrv.h
> +++ linux-2.6/drivers/pci/pcie/portdrv.h
> @@ -25,6 +25,12 @@
>  #define PCIE_CAPABILITIES_REG		0x2
>  #define PCIE_SLOT_CAPABILITIES_REG	0x14
>  #define PCIE_PORT_DEVICE_MAXSERVICES	4
> +#define PCIE_PORT_MSI_VECTOR_MASK	0x1f
> +/*
> + * According to the PCI Express Base Specification 2.0, the indices of the MSI-X
> + * table entries used by port services must not exceed 31
> + */
> +#define PCIE_PORT_MAX_MSIX_ENTRIES	32
>
>  #define get_descriptor_id(type, service) (((type - 4) << 4) | service)
>
> Index: linux-2.6/drivers/pci/msi.c
> ===================================================================
> --- linux-2.6.orig/drivers/pci/msi.c
> +++ linux-2.6/drivers/pci/msi.c
> @@ -670,6 +670,23 @@ static int msi_free_irqs(struct pci_dev*
>  }
>
>  /**
> + * pci_msix_table_size - return the number of device's MSI-X table entries
> + * @dev: pointer to the pci_dev data structure of MSI-X device function
> + */
> +int pci_msix_table_size(struct pci_dev *dev)
> +{
> +	int pos;
> +	u16 control;
> +
> +	pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
> +	if (!pos)
> +		return 0;
> +
> +	pci_read_config_word(dev, msi_control_reg(pos), &control);
> +	return multi_msix_capable(control);
> +}
> +
> +/**

I think this pci_msix_table_size() is useful on its own.
It would be nice if we could have separate patches, i.e.:
 [PATCH] PCI/MSI: introduce pci_msix_table_size()
 [PATCH] PCI PCIe portdrv: Fix allocation of interrupts (rev. 6)

Thanks,
H.Seto

>   * pci_enable_msix - configure device's MSI-X capability structure
>   * @dev: pointer to the pci_dev data structure of MSI-X device function
>   * @entries: pointer to an array of MSI-X entries
> @@ -686,9 +703,8 @@ static int msi_free_irqs(struct pci_dev*
>   **/
>  int pci_enable_msix(struct pci_dev* dev, struct msix_entry *entries, int nvec)
>  {
> -	int status, pos, nr_entries;
> +	int status, nr_entries;
>  	int i, j;
> -	u16 control;
>
>  	if (!entries)
>  		return -EINVAL;
> @@ -697,9 +713,7 @@ int pci_enable_msix(struct pci_dev* dev,
>  	if (status)
>  		return status;
>
> -	pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
> -	pci_read_config_word(dev, msi_control_reg(pos), &control);
> -	nr_entries = multi_msix_capable(control);
> +	nr_entries = pci_msix_table_size(dev);
>  	if (nvec > nr_entries)
>  		return -EINVAL;
>
> Index: linux-2.6/include/linux/pci.h
> ===================================================================
> --- linux-2.6.orig/include/linux/pci.h
> +++ linux-2.6/include/linux/pci.h
> @@ -799,6 +799,10 @@ static inline void pci_msi_shutdown(stru
>  static inline void pci_disable_msi(struct pci_dev *dev)
>  { }
>
> +static inline int pci_msix_table_size(struct pci_dev *dev)
> +{
> +	return 0;
> +}
>  static inline int pci_enable_msix(struct pci_dev *dev,
>  	struct msix_entry *entries, int nvec)
>  {
> @@ -823,6 +827,7 @@ static inline int pci_msi_enabled(void)
>  extern int pci_enable_msi(struct pci_dev *dev);
>  extern void pci_msi_shutdown(struct pci_dev *dev);
>  extern void pci_disable_msi(struct pci_dev *dev);
> +extern int pci_msix_table_size(struct pci_dev *dev);
>  extern int pci_enable_msix(struct pci_dev *dev,
>  	struct msix_entry *entries, int nvec);
>  extern void pci_msix_shutdown(struct pci_dev *dev);
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
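
For anyone who wants to experiment with the index bookkeeping discussed in this thread outside the kernel, here is a minimal userspace sketch. It is not part of the patch. It assumes the register layout quoted in the patch comments (Interrupt Message Number in bits 13:9 of the PCI Express Capabilities register, AER message number in bits 31:27 of the Root Error Status register). The names pme_hp_msg_num(), aer_msg_num(), add_entry() and count_needed_vectors() are hypothetical, and entries are kept in a plain int array instead of struct msix_entry.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_MSIX_ENTRIES 32	/* mirrors PCIE_PORT_MAX_MSIX_ENTRIES in the patch */
#define MSG_NUM_MASK	 0x1f	/* mirrors PCIE_PORT_MSI_VECTOR_MASK in the patch */

/* Interrupt Message Number: bits 13:9 of the PCI Express Capabilities register. */
static int pme_hp_msg_num(uint16_t pcie_caps)
{
	return (pcie_caps >> 9) & MSG_NUM_MASK;
}

/* AER Interrupt Message Number: bits 31:27 of the Root Error Status register. */
static int aer_msg_num(uint32_t root_status)
{
	return (int)(root_status >> 27);
}

/*
 * Record a table index only once and return its position in the array.
 * Same idea as pcie_port_msix_add_entry() in the patch, on a plain int array.
 */
static int add_entry(int *entries, int new_entry, int nr_entries)
{
	int j;

	for (j = 0; j < nr_entries; j++)
		if (entries[j] == new_entry)
			return j;

	assert(nr_entries < MAX_MSIX_ENTRIES);	/* caller must size the array */
	entries[j] = new_entry;
	return j;
}

/*
 * Hypothetical helper: from raw register values, fill entries[] with the
 * distinct MSI-X table indices and return how many vectors are needed.
 */
static int count_needed_vectors(uint16_t pcie_caps, uint32_t root_status,
				int *entries)
{
	int nvec = 0, i;

	/* PME and hotplug share one message number. */
	i = add_entry(entries, pme_hp_msg_num(pcie_caps), nvec);
	if (i == nvec)
		nvec++;

	/* AER may or may not use the same table entry. */
	i = add_entry(entries, aer_msg_num(root_status), nvec);
	if (i == nvec)
		nvec++;

	return nvec;
}
```

When both registers report the same message number, count_needed_vectors() returns 1, which corresponds to the case where the second pci_enable_msix() call in the patch would ask for a single vector.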