Re: [RFC PATCH 0/9] pci: portdrv: Add auxiliary bus and register CXL PMUs (and aer)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Sep 05 2024 at 12:23, Jonathan Cameron wrote:
>> Look at how the PCI device manages interrupts with the per device MSI
>> mechanism. Forget doing this with either one of the legacy mechanisms.
>> 
>>   1) It creates a hierarchical interrupt domain and gets the required
>>      resources from the provided parent domain. The PCI side does not
>>      care whether this is x86 or arm64 and it neither cares whether the
>>      parent domain does remapping or not. The only way it cares is about
>>      the features supported by the different parent domains (think
>>      multi-MSI).
>>      
>>   2) Driver side allocations go through the per device domain
>> 
>> That AUXbus is not any different. When the CPMU driver binds it wants to
>> allocate interrupts. So instead of having a convoluted construct
>> reaching into the parent PCI device, you simply can do:
>> 
>>   1) Let the cxl_pci driver create a MSI parent domain and set that in
>>      the subdevice::msi::irqdomain pointer.
>> 
>>   2) Provide cxl_aux_bus_create_irqdomain() which allows the CPMU device
>>      driver to create a per device interrupt domain.
>> 
>>   3) Let the CPMU driver create its per device interrupt domain with
>>      the provided parent domain
>> 
>>   4) Let the CPMU driver allocate its MSI-X interrupts through the per
>>      device domain
>> 
>> Now on the cxl_pci side the AUXbus parent interrupt domain allocation
>> function does:
>> 
>>     if (!pci_msix_enabled(pdev))
>>     	return pci_msix_enable_and_alloc_irq_at(pdev, ....);
>>     else
>>         return pci_msix_alloc_irq_at(pdev, ....);

Ignore the above. Brainfart.

> I'm struggling to follow this suggestion
> Would you expect the cxl_pci MSI parent domain to set it's parent as
> msi_domain = irq_domain_create_hierarchy(dev_get_msi_domain(&pdev->dev),
> 					 IRQ_DOMAIN_FLAG_MSI_PARENT,
> 					 ...
> which seems to cause a problem with deadlocks in __irq_domain_alloc_irqs()
> or create a separate domain structure and provide callbacks that reach into
> the parent domain as necessary?
>
> Or do I have this entirely wrong? I'm struggling to relate what existing
> code like PCI does to what I need to do here.

dev_get_msi_domain(&pdev->dev) is a nightmare due to the 4 different
models we have:

   - Legacy has no domain

   - Non-hierarchical domain

   - Hierarchical domain v1

         That's the global PCI/MSI domain

   - Hierarchical domain v2

      That's the underlying MSI_PARENT domain, which is on x86
      either the interrupt remap unit or the vector domain. On arm64
      it's the ITS domain.

See e.g. pci_msi_domain_supports() which handles the mess.

Now this proposal will only work on hierarchical domain v2 because that
can do the post enable allocations on MSI-X. Let's put a potential
solution for MSI aside for now to avoid complexity explosion.

So there are two ways to solve this:

  1) Implement the CXL MSI parent domain as disjunct domain

  2) Teach the V2 per device MSI-X domain to be a parent domain

#1 Looks pretty straight forward, but does not seemlessly fit the
   hierarchical domain model and creates horrible indirections.

#2 Is the proper abstraction, but requires to teach the MSI core code
   about stacked MSI parent domains, which should not be horribly hard
   or maybe just works out of the box.

The PCI device driver invokes the not yet existing function
pci_enable_msi_parent_domain(pci_dev). This function does:

  A) validate that this PCI device has a V2 parent domain

  B) validate that the device has enabled MSI-X

  C) validate that the PCI/MSI-X domain supports dynamic MSI-X
     allocation

  D) if #A to #C are true, it enables the PCI device parent domain

I made #B a prerequisite for now, as that's an orthogonal problem, which
does not need to be solved upfront. Maybe it does not need to be solved
at all because the underlying PCI driver always allocates a management
interrupt before dealing with the underlying "bus", which is IMHO a
reasonable expectation. At least it's a reasonable restriction for
getting started.

That function does:

     msix_dom = pci_msi_get_msix_domain(pdev);
     msix_dom->flags |= IRQ_DOMAIN_FLAG_MSI_PARENT;
     msix_dom->msi_parent_ops = &pci_msix_parent_ops;

When creating the CXL devices the CXL code invokes
pci_msi_init_subdevice(pdev, &cxldev->device), which just does:

  dev_set_msi_domain(dev, pci_msi_get_msix_subdevice_parent(pdev));

That allows the CXL devices to create their own per device MSI
domain via a new function pci_msi_create_subdevice_irq_domain().

That function can just use a variant of pci_create_device_domain() with
a different domain template and a different irqchip, where the irqchip
just redirects to the underlying parent domain chip, aka PCI-MSI-X.

I don't think the CXL device will require a special chip as all they
should need to know is the linux interrupt number. If there are special
requirements then we can bring the IMS code back or do something similar
to the platform MSI code. 

Then you need pci_subdev_msix_alloc_at() and pci_subdev_msix_free()
which are the only two functions which the CXL (or similar) need.

The existing pci_msi*() API function might need a safety check so they
can't be abused by subdevices, but that's a problem for a final
solution.

That's pretty much all what it takes. Hope that helps.

Thanks,

        tglx






[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux