Hi, Bjorn and Marc, On Fri, May 26, 2023 at 5:40 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > > On Thu, May 25, 2023 at 05:14:28PM +0800, Huacai Chen wrote: > > On Wed, May 24, 2023 at 11:21 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote: > > > On Wed, May 24, 2023 at 05:36:23PM +0800, Huacai Chen wrote: > > > > Some platforms (such as LoongArch) cannot provide enough irq numbers as > > > > many as logical cpu numbers. So we should limit pci irq numbers when > > > > allocate msi/msix vectors, otherwise some device drivers may fail at > > > > initialization. This patch add a cmdline parameter "pci_irq_limit=xxxx" > > > > to control the limit. > > > > > > > > The default pci msi/msix number limit is defined 32 for LoongArch and > > > > NR_IRQS for other platforms. > > > > > > The IRQ experts can chime in on this, but this doesn't feel right to > > > me. I assume arch code should set things up so only valid IRQ numbers > > > can be allocated. This doesn't seem necessarily PCI-specific, I'd > > > prefer to avoid an arch #ifdef here, and I'd also prefer to avoid a > > > command-line parameter that users have to discover and supply. > > > > The problem we meet: LoongArch machines can have as many as 256 > > logical cpus, and the maximum of msi vectors is 192. Even on a 64-core > > machine, 192 irqs can be easily exhausted if there are several NICs > > (NIC usually allocates msi irqs depending on the number of online > > cpus). So we want to limit the msi allocation. > > > > This is not a LoongArch-specific problem, because I think other > > platforms can also meet if they have many NICs. But of course, > > LoongArch can meet it more easily because the available msi vectors > > are very few. So, adding a cmdline parameter is somewhat reasonable. > > The patch contains "#ifdef CONFIG_LOONGARCH", which makes this > solution LoongArch-specific. I'm not willing for that yet. > > It sounds like the LoongArch MSI limit is known at compile-time, or at > least at boot-time, so the kernel ought to be able to figure out what > to do without a command-line parameter. > > > After some investigation, I think it may be possible to modify > > drivers/irqchip/irq-loongson-pch-msi.c and override > > msi_domain_info::domain_alloc_irqs() to limit msi allocation. However, > > doing that need to remove the "static" before > > __msi_domain_alloc_irqs(), which means revert > > 762687ceb31fc296e2e1406559e8bb5 ("genirq/msi: Make > > __msi_domain_alloc_irqs() static"), I don't know whether that is > > acceptable. > > I guess you mean msi_domain_ops::domain_alloc_irqs() (not > msi_domain_info). If this is really a generic problem, I'm surprised > that no other arch has needed to override .domain_alloc_irqs(). Yes, I mean msi_domain_ops::domain_alloc_irqs() here. > > I think you'll have better luck getting feedback if you can post the > complete working patch. At the very least, you'll learn more about > the problem by doing that. Emm, I found I can do some small modification on msi_domain_prepare_irqs(), and solve the problem by overriding msi_domain_ops::msi_prepare(), thanks. And patches is coming soon. Huacai > > Bjorn