On Tue, 25 Oct 2022, Bjorn Helgaas wrote:
In short that calls: /* Allocate the maximum possible number of MSI/MSI-X vectors */ nr_entries = pci_alloc_irq_vectors(dev, 1, PCIE_PORT_MAX_MSI_ENTRIES, PCI_IRQ_MSIX | PCI_IRQ_MSI); /* See how many and which Interrupt Message Numbers we actually use */ nvec = pcie_message_numbers(dev, mask, &pme, &aer, &dpc); if (nvec != nr_entries) { pci_free_irq_vectors(dev); nr_entries = pci_alloc_irq_vectors(dev, nvec, nvec, PCI_IRQ_MSIX | PCI_IRQ_MSI); } My worry here is that the implicit assumption is that the vectors won't move if we reduce the overall number of vectors we are asking for...
This would also apply to what is currently in portdrv machinery, no?
However, imagine the case that we have a feature the driver doesn't know about that was previously at a higher vector. After reducing the vectors allocated the hardware might decide that feature needs its own vector whereas some others can be combined. Hence we'd end up with a less than ideal packing for the features we actually support. Could do something iterative to solve this if it actually matters (increase number of vectors until the layout matches what we get with max possible vectors).
Maybe do a bounded retry loop until we get stable value? retry = 1; do { pci_alloc_irq_vectors(1, 32); nvecs = get_max_msgnum(); // max(pmu, events, mbox, isolation) pci_free_irq_vectors(); pci_alloc_irq_vectors(nvecs, nvecs); new_nvecs = get_max_msgnum(); if (likely(new_nvecs == nvecs)) return 0; pci_free_irq_vectors(); } while (retry--); return -1; // no irq support But yeah I'm not sure how much we actually care about this. But if so, it also might be worth re-visiting the generic table thing, as if nothing else it can standalone co-exist and avoid allocating any irqs altogether if we know a-priori that there is no irq support.
Is this cxl code allocating vectors for devices that might also be claimed by portdrv? I assume not because that sounds like a problem. Ugh. I always feel like the portdrv design must be sub-optimal because this seems so hard to do cleanly. pci_alloc_irq_vectors() has a lot of magic inside it and is great for most drivers, but the PCIe service IRQs are definitely unusual and maybe it's not the best fit for this situation. If I understand correctly, Interrupt Message Numbers for all these PCIe services (hotplug, AER, DPC, etc) are restricted to 0-31 for both MSI and MSI-X, and the reason we don't just allocate 32 vectors all the time is to avoid consuming too many IRQs.
Most CXL features that can have irqs will normally use only the first 16, with the exception of isolation (cxl 3.0), which per the spec is up to 32.
The MSI case is ugly because the Interrupt Message Number can change when we set Multiple Message Enable. Maybe we can separate it out and have a less than optimal solution for this case, like allocating one or two vectors and polling if that's not enough. I expect most devices will support MSI-X.
Would only supporting MSI-X be so terrible? Thanks, Davidlohr