Re: pcie-iproc-msi.c: Bug in Multi-MSI support?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 20 May 2021 18:11:32 +0100,
Ray Jui <ray.jui@xxxxxxxxxxxx> wrote:
>
> On 5/20/2021 7:22 AM, Sandor Bodo-Merle wrote:
> > On Thu, May 20, 2021 at 4:05 PM Pali Rohár <pali@xxxxxxxxxx> wrote:
> >>
> >> Hello!
> >>
> >> On Thursday 20 May 2021 15:47:46 Sandor Bodo-Merle wrote:
> >>> Hi Pali,
> >>>
> >>> thanks for catching this - i dig up the followup fixup commit we have
> >>> for the iproc multi MSI (it was sent to Broadcom - but unfortunately
> >>> we missed upstreaming it).
> >>>
> >>> Commit fc54bae28818 ("PCI: iproc: Allow allocation of multiple MSIs")
> >>> failed to reserve the proper number of bits from the inner domain.
> >>> We need to allocate the proper amount of bits otherwise the domains for
> >>> multiple PCIe endpoints may overlap and freeing one of them will result
> >>> in freeing unrelated MSI vectors.
> >>>
> >>> Fixes: fc54bae28818 ("PCI: iproc: Allow allocation of multiple MSIs")
> >>> ---
> >>>  drivers/pci/host/pcie-iproc-msi.c | 8 ++++----
> >>>  1 file changed, 4 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git drivers/pci/host/pcie-iproc-msi.c drivers/pci/host/pcie-iproc-msi.c
> >>> index 708fdb1065f8..a00492dccb74 100644
> >>> --- drivers/pci/host/pcie-iproc-msi.c
> >>> +++ drivers/pci/host/pcie-iproc-msi.c
> >>> @@ -260,11 +260,11 @@ static int iproc_msi_irq_domain_alloc(struct
> >>> irq_domain *domain,
> >>>
> >>>         mutex_lock(&msi->bitmap_lock);
> >>>
> >>> -       /* Allocate 'nr_cpus' number of MSI vectors each time */
> >>> +       /* Allocate 'nr_irqs' multiplied by 'nr_cpus' number of MSI
> >>> vectors each time */
> >>>         hwirq = bitmap_find_next_zero_area(msi->bitmap, msi->nr_msi_vecs, 0,
> >>> -                                          msi->nr_cpus, 0);
> >>> +                                          msi->nr_cpus * nr_irqs, 0);
> >>
> >> I'm not sure if this construction is correct. Multi-MSI interrupts needs
> >> to be aligned to number of requested interrupts. So if wifi driver asks
> >> for 32 Multi-MSI interrupts then first allocated interrupt number must
> >> be dividable by 32.
> >>
> > 
> > Ahh - i guess you are right. In our internal engineering we always
> > request 32 vectors.
> > IIRC the multiply by "nr_irqs" was added for iqr affinity to work correctly.
> > 
> 
> May I ask which platforms are you guys running this driver on? Cygnus or
> Northstar? Not that it matters, but just out of curiosity.
> 
> Let me start by explaining how MSI support works in this driver, or more
> precisely, for all platforms that support this iProc based event queue
> MSI scheme:
> 
> In iProc PCIe core, each MSI group is serviced by a GIC interrupt
> (hwirq) and a dedicated event queue (event queue is paired up with
> hwirq).  Each MSI group can support up to 64 MSI vectors. Note 64 is the
> depth of the event queue.
> 
> The number of MSI groups varies between different iProc SoCs. The total
> number of CPU cores also varies. To support MSI IRQ affinity, we
> distribute GIC interrupts across all available CPUs.  MSI vector is
> moved from one GIC interrupt to another to steer to the target CPU.
> 
> Assuming:
> The number of MSI groups (the number of total hwirq for this PCIe
> controller) is M
> The number of CPU cores is N
> M is always a multiple of N (we ensured that in the setup function)
> 
> Therefore:
> Total number of raw MSI vectors = M * 64
> Total number of supported MSI vectors = (M * 64) / N
> 
> I guess I'm not too clear on what you mean by "multi-MSI interrupts
> needs to be aligned to number of requested interrupts.". Would you be
> able to plug this into the above explanation so we can have a more clear
> understanding of what you mean here?

That's a generic PCI requirement: if you are providing a Multi-MSI
configuration, the base vector number has to be size-aligned
(2-aligned for 2 MSIs, 4 aligned for 4, up to 32), and the end-point
supplies up to 5 bits that are orr-ed into the base vector number,
with a *single* doorbell address. You effectively provide a single MSI
number and a single address, and the device knows how to drive 2^n MSIs.

This is different from MSI-X, which defines multiple individual
vectors, each with their own doorbell address.

The main problem you have here (other than the broken allocation
mechanism) is that moving an interrupt from one core to another
implies moving the doorbell address to that of another MSI
group. This isn't possible for Multi-MSI, as all the MSIs must have
the same doorbell address. As far as I can see, there is no way to
support Multi-MSI together with affinity change on this HW, and you
should stop advertising support for this feature.

There is also a more general problem here, which is the atomicity of
the update on affinity change. If you are moving an interrupt from one
CPU to the other, it seems you change both the vector number and the
target address. If that is the case, this isn't atomic, and you may
end-up with the device generating a message based on a half-applied
update.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux