On Thu, May 25, 2023 at 05:14:28PM +0800, Huacai Chen wrote: > On Wed, May 24, 2023 at 11:21 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > > On Wed, May 24, 2023 at 05:36:23PM +0800, Huacai Chen wrote: > > > Some platforms (such as LoongArch) cannot provide enough irq numbers as > > > many as logical cpu numbers. So we should limit pci irq numbers when > > > allocate msi/msix vectors, otherwise some device drivers may fail at > > > initialization. This patch add a cmdline parameter "pci_irq_limit=xxxx" > > > to control the limit. > > > > > > The default pci msi/msix number limit is defined 32 for LoongArch and > > > NR_IRQS for other platforms. > > > > The IRQ experts can chime in on this, but this doesn't feel right to > > me. I assume arch code should set things up so only valid IRQ numbers > > can be allocated. This doesn't seem necessarily PCI-specific, I'd > > prefer to avoid an arch #ifdef here, and I'd also prefer to avoid a > > command-line parameter that users have to discover and supply. > > The problem we meet: LoongArch machines can have as many as 256 > logical cpus, and the maximum of msi vectors is 192. Even on a 64-core > machine, 192 irqs can be easily exhausted if there are several NICs > (NIC usually allocates msi irqs depending on the number of online > cpus). So we want to limit the msi allocation. > > This is not a LoongArch-specific problem, because I think other > platforms can also meet if they have many NICs. But of course, > LoongArch can meet it more easily because the available msi vectors > are very few. So, adding a cmdline parameter is somewhat reasonable. The patch contains "#ifdef CONFIG_LOONGARCH", which makes this solution LoongArch-specific. I'm not willing for that yet. It sounds like the LoongArch MSI limit is known at compile-time, or at least at boot-time, so the kernel ought to be able to figure out what to do without a command-line parameter. > After some investigation, I think it may be possible to modify > drivers/irqchip/irq-loongson-pch-msi.c and override > msi_domain_info::domain_alloc_irqs() to limit msi allocation. However, > doing that need to remove the "static" before > __msi_domain_alloc_irqs(), which means revert > 762687ceb31fc296e2e1406559e8bb5 ("genirq/msi: Make > __msi_domain_alloc_irqs() static"), I don't know whether that is > acceptable. I guess you mean msi_domain_ops::domain_alloc_irqs() (not msi_domain_info). If this is really a generic problem, I'm surprised that no other arch has needed to override .domain_alloc_irqs(). I think you'll have better luck getting feedback if you can post the complete working patch. At the very least, you'll learn more about the problem by doing that. Bjorn