[+cc Christoph, Thomas for INTx/MSI/bus mastering question below]

On Wed, Nov 17, 2021 at 12:23:35PM +0000, Jonathan Cameron wrote:
> On Tue, 16 Nov 2021 17:48:29 -0600
> Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > On Fri, Nov 05, 2021 at 04:50:54PM -0700, ira.weiny@xxxxxxxxx wrote:
> > > From: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> > >
> > > CXL devices have DOE mailboxes. Create auxiliary devices which can be
> > > driven by the generic DOE auxiliary driver.
> >
> > Based on the ECN, it sounds like any PCI device can have DOE
> > capabilities, so I suspect the support for it should be in
> > drivers/pci/, not drivers/cxl/. I don't really see anything
> > CXL-specific below.
>
> Agreed, though how it all gets tied together isn't totally clear
> to me yet. The messy bit is interrupts, given I don't think we have
> a model for enabling those anywhere other than in individual PCI drivers.

Ah. Yeah, that is a little messy. The only real precedent where the
PCI core and a driver might need to coordinate on interrupts is the
portdrv. So far we've pretended that bridges do not have
device-specific functionality that might require interrupts. I don't
think that's actually true, but we haven't integrated drivers for the
tuning, performance monitoring, and similar features that bridges may
have. Yet.

In any case, I think the argument that DOE capabilities are not
CXL-specific still holds.

> > What do these DOE capabilities look like in lspci? I don't see any
> > support in the current version (which looks like it's a year old).
>
> I don't think anyone has added support yet, but it would be simple to do.
> Given the possibility of breaking things if we actually exercise the
> discovery protocol, we'll be constrained to just reporting that there
> are DOE instances, which is of limited use.

I think it's essential that lspci at least show the existence of DOE
capabilities and the safe-to-read registers (Capabilities, Control,
Status). There's a very long lead time between adding the support and
getting updated versions of lspci into distros.

> > > +	 * An implementation of a cxl type3 device may support an unknown
> > > +	 * number of interrupts. Assume that number is not that large and
> > > +	 * request them all.
> > > +	 */
> > > +	irqs = pci_msix_vec_count(pdev);
> > > +	rc = pci_alloc_irq_vectors(pdev, irqs, irqs, PCI_IRQ_MSIX);
> > > +	if (rc != irqs) {
> > > +		/* No interrupt available - carry on */
> > > +		dev_dbg(dev, "No interrupts available for DOE\n");
> > > +	} else {
> > > +		/*
> > > +		 * Enabling bus mastering could be done within the DOE
> > > +		 * initialization, but as it potentially has other impacts
> > > +		 * keep it within the driver.
> > > +		 */
> > > +		pci_set_master(pdev);
> >
> > This enables the device to perform DMA, which doesn't seem to have
> > anything to do with the rest of this code. Can it go somewhere
> > near something to do with DMA?
>
> Needed for MSI/MSI-X as well. The driver doesn't do DMA for anything
> else. Hence it's here in the interrupt enable path.

Oh, right, of course. A hint here that MSI/MSI-X depends on bus
mastering would save me the trouble.

I wonder if the infrastructure, e.g., something inside
pci_alloc_irq_vectors_affinity(), should do this for us. The
connection is "obvious" but not mentioned in
Documentation/PCI/msi-howto.rst, and I'm not sure how callers that
supply PCI_IRQ_ALL_TYPES would know whether they got a single MSI
vector (which requires bus mastering) or an INTx vector (which does
not).
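
Purely for illustration, an untested caller-side sketch of making that
dependency explicit (doe_request_irqs() is a made-up name; the
pci_msix_vec_count(), pci_alloc_irq_vectors(), pci_set_master() calls
and the msi_enabled/msix_enabled flags in struct pci_dev are existing
interfaces):

#include <linux/pci.h>

/*
 * Untested sketch only; doe_request_irqs() is a hypothetical name.
 * The point is to make the MSI/MSI-X -> bus mastering dependency
 * visible at the call site.
 */
static int doe_request_irqs(struct pci_dev *pdev)
{
	int nvec = pci_msix_vec_count(pdev);
	int rc;

	if (nvec < 1)
		nvec = 1;	/* no MSI-X table; still try MSI or INTx */

	rc = pci_alloc_irq_vectors(pdev, 1, nvec, PCI_IRQ_ALL_TYPES);
	if (rc < 0)
		return rc;

	/*
	 * MSI/MSI-X interrupts are posted memory writes issued by the
	 * device, so they are only delivered if bus mastering is
	 * enabled.  INTx does not need it.
	 */
	if (pdev->msi_enabled || pdev->msix_enabled)
		pci_set_master(pdev);

	return rc;	/* number of vectors actually allocated */
}

That at least documents why bus mastering is being turned on, even if
the longer-term answer is to push it into the core.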
> > So we get an auxiliary device for every instance of a DOE
> > capability? I think the commit log should mention something about
> > how many are created (e.g., "one per DOE capability"), how they
> > are named, whether they appear in sysfs, how drivers bind to them,
> > etc.
> >
> > I assume there needs to be some coordination between possible
> > multiple users of a DOE capability? How does that work?
>
> The DOE handling implementation makes everything synchronous, so
> multiple users may each have to wait to queue their query/response
> exchanges.
>
> The fun of non-OS software accessing these is still an open
> question.

Sounds like something that potentially could be wrapped up in a safe
but slow interface that could be usable by others, including lspci?
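
For concreteness, an untested, purely illustrative sketch of what such
a serialized "safe but slow" exchange path might look like; every name
in it (struct doe_mb, struct doe_task, doe_exchange(), and so on) is
made up for illustration and not from the posted patches, the actual
DOE register protocol is elided, and how this would be exposed to
non-driver users such as lspci is left open:

#include <linux/errno.h>
#include <linux/mutex.h>
#include <linux/pci.h>

/* All names below are hypothetical. */
struct doe_mb {
	struct pci_dev *pdev;
	int cap_offset;		/* config offset of this DOE capability */
	struct mutex lock;	/* serializes all exchanges on this mailbox;
				 * initialized with mutex_init() at discovery */
};

struct doe_task {
	const u32 *request;
	size_t request_dw;	/* request length in dwords */
	u32 *response;
	size_t response_dw;	/* response buffer size in dwords */
};

/*
 * Hypothetical low-level helper; the actual DOE register dance
 * (write request, set Go, poll Data Object Ready, read response)
 * is elided here.
 */
static int doe_do_exchange_locked(struct doe_mb *mb, struct doe_task *task)
{
	return -EOPNOTSUPP;
}

/*
 * Synchronous exchange: multiple users simply queue up on the mutex
 * and each waits for its own query/response to complete.
 */
static int doe_exchange(struct doe_mb *mb, struct doe_task *task)
{
	int rc;

	mutex_lock(&mb->lock);
	rc = doe_do_exchange_locked(mb, task);
	mutex_unlock(&mb->lock);

	return rc;
}

Anything that wants the "safe" path, whoever ultimately calls it, would
presumably have to funnel through that same lock.

Bjorn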