Re: [RFC PATCH 0/9] pci: portdrv: Add auxiliary bus and register CXL PMUs (and aer)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 06 Sep 2024 12:11:36 +0200
Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:

> On Thu, Sep 05 2024 at 12:23, Jonathan Cameron wrote:
> >> Look at how the PCI device manages interrupts with the per device MSI
> >> mechanism. Forget doing this with either one of the legacy mechanisms.
> >> 
> >>   1) It creates a hierarchical interrupt domain and gets the required
> >>      resources from the provided parent domain. The PCI side does not
> >>      care whether this is x86 or arm64 and it neither cares whether the
> >>      parent domain does remapping or not. The only way it cares is about
> >>      the features supported by the different parent domains (think
> >>      multi-MSI).
> >>      
> >>   2) Driver side allocations go through the per device domain
> >> 
> >> That AUXbus is not any different. When the CPMU driver binds it wants to
> >> allocate interrupts. So instead of having a convoluted construct
> >> reaching into the parent PCI device, you simply can do:
> >> 
> >>   1) Let the cxl_pci driver create a MSI parent domain and set that in
> >>      the subdevice::msi::irqdomain pointer.
> >> 
> >>   2) Provide cxl_aux_bus_create_irqdomain() which allows the CPMU device
> >>      driver to create a per device interrupt domain.
> >> 
> >>   3) Let the CPMU driver create its per device interrupt domain with
> >>      the provided parent domain
> >> 
> >>   4) Let the CPMU driver allocate its MSI-X interrupts through the per
> >>      device domain
> >> 
> >> Now on the cxl_pci side the AUXbus parent interrupt domain allocation
> >> function does:
> >> 
> >>     if (!pci_msix_enabled(pdev))
> >>     	return pci_msix_enable_and_alloc_irq_at(pdev, ....);
> >>     else
> >>         return pci_msix_alloc_irq_at(pdev, ....);  
> 
> Ignore the above. Brainfart.
> 
> > I'm struggling to follow this suggestion
> > Would you expect the cxl_pci MSI parent domain to set it's parent as
> > msi_domain = irq_domain_create_hierarchy(dev_get_msi_domain(&pdev->dev),
> > 					 IRQ_DOMAIN_FLAG_MSI_PARENT,
> > 					 ...
> > which seems to cause a problem with deadlocks in __irq_domain_alloc_irqs()
> > or create a separate domain structure and provide callbacks that reach into
> > the parent domain as necessary?
> >
> > Or do I have this entirely wrong? I'm struggling to relate what existing
> > code like PCI does to what I need to do here.  
> 
> dev_get_msi_domain(&pdev->dev) is a nightmare due to the 4 different
> models we have:
> 
>    - Legacy has no domain
> 
>    - Non-hierarchical domain
> 
>    - Hierarchical domain v1
> 
>          That's the global PCI/MSI domain
> 
>    - Hierarchical domain v2
> 
>       That's the underlying MSI_PARENT domain, which is on x86
>       either the interrupt remap unit or the vector domain. On arm64
>       it's the ITS domain.
> 
> See e.g. pci_msi_domain_supports() which handles the mess.
> 
> Now this proposal will only work on hierarchical domain v2 because that
> can do the post enable allocations on MSI-X. Let's put a potential
> solution for MSI aside for now to avoid complexity explosion.
> 
> So there are two ways to solve this:
> 
>   1) Implement the CXL MSI parent domain as disjunct domain
> 
>   2) Teach the V2 per device MSI-X domain to be a parent domain
> 
> #1 Looks pretty straight forward, but does not seemlessly fit the
>    hierarchical domain model and creates horrible indirections.
> 
> #2 Is the proper abstraction, but requires to teach the MSI core code
>    about stacked MSI parent domains, which should not be horribly hard
>    or maybe just works out of the box.
> 
> The PCI device driver invokes the not yet existing function
> pci_enable_msi_parent_domain(pci_dev). This function does:
> 
>   A) validate that this PCI device has a V2 parent domain
> 
>   B) validate that the device has enabled MSI-X
> 
>   C) validate that the PCI/MSI-X domain supports dynamic MSI-X
>      allocation
> 
>   D) if #A to #C are true, it enables the PCI device parent domain
> 
> I made #B a prerequisite for now, as that's an orthogonal problem, which
> does not need to be solved upfront. Maybe it does not need to be solved
> at all because the underlying PCI driver always allocates a management
> interrupt before dealing with the underlying "bus", which is IMHO a
> reasonable expectation. At least it's a reasonable restriction for
> getting started.
> 
> That function does:
> 
>      msix_dom = pci_msi_get_msix_domain(pdev);
>      msix_dom->flags |= IRQ_DOMAIN_FLAG_MSI_PARENT;
>      msix_dom->msi_parent_ops = &pci_msix_parent_ops;
> 
> When creating the CXL devices the CXL code invokes
> pci_msi_init_subdevice(pdev, &cxldev->device), which just does:
> 
>   dev_set_msi_domain(dev, pci_msi_get_msix_subdevice_parent(pdev));
> 
> That allows the CXL devices to create their own per device MSI
> domain via a new function pci_msi_create_subdevice_irq_domain().
> 
> That function can just use a variant of pci_create_device_domain() with
> a different domain template and a different irqchip, where the irqchip
> just redirects to the underlying parent domain chip, aka PCI-MSI-X.
> 
> I don't think the CXL device will require a special chip as all they
> should need to know is the linux interrupt number. If there are special
> requirements then we can bring the IMS code back or do something similar
> to the platform MSI code. 
> 
> Then you need pci_subdev_msix_alloc_at() and pci_subdev_msix_free()
> which are the only two functions which the CXL (or similar) need.
> 
> The existing pci_msi*() API function might need a safety check so they
> can't be abused by subdevices, but that's a problem for a final
> solution.
> 
> That's pretty much all what it takes. Hope that helps.

Absolutely!  Thanks for providing the detailed suggestion
this definitely smells a lot less nasty than previous approach.

I have things sort of working now but it's ugly code with a few
cross layer hacks that need tidying up (around programming the
msi registers from wrong 'device'), so may be a little
while before I get it in a state to post.

Jonathan

> 
> Thanks,
> 
>         tglx
> 
> 
> 





[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux