[ add Lukas and Chuck ]

On Tue, May 3, 2022 at 8:35 AM Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> wrote:
>
> So far, we have been considering Data Object Exchange (DOE) mailboxes only
> on EPs (CXL type 3 devices).
> CXL CDAT (technically CXL Table Query Protocol but lets just call it CDAT)
> https://lore.kernel.org/linux-cxl/20220414203237.2198665-1-ira.weiny@xxxxxxxxx
> CMA/SPDM support
> https://lore.kernel.org/linux-cxl/20220303135905.10420-1-Jonathan.Cameron@xxxxxxxxxx/
>
> However, a number of DOE protocols apply to switch (and root) ports.
> DOE instances supporting CDAT occur on switch upstream ports as well as EPs.
>
> DOE instances supporting CMA may occur in root ports, upstream switch ports,
> downstream switch ports and EPs (including multiple functions where relevant).
>

So, like you, I was envisioning all the CMA and SPDM code landing in the
kernel until I read this:

"Extending in-kernel TLS support"
https://lwn.net/Articles/892216/

...and questioned why this new CMA/SPDM session establishment, which is
similar to TLS, should be done inside the kernel while TLS session
establishment is done in userspace. I had a chance to chat with Chuck at
LSF/MM and confirmed there is little appetite to change this up-call
requirement for session establishment, and I expect CMA to be the same.
The rough idea of how this works with CMA/SPDM is providing an ABI to
retrieve session setup data, with the end result of userspace
instantiating a keyid via keyctl to be used for future SPDM messages.

> The intent of this RFC is to discuss how to actually implement such support.
> The attached patch is a really rough PoC for CDAT on upstream switch ports
> done by adding a new pcie_port_service_driver. This is different from the
> proposed auxiliary device used for CXL type 3 devices (for now).

CDAT to me is on the "CXL" side of a given PCI device.
Given that endpoints and switches are each represented by cxl_port
objects, it seems those should generically carry the CDAT binary
attribute, not the PCI device, don't you think?

>
> So open questions:
>
> 1. Granularity. Should we do a driver per group of protocols that may
>    be collocated, or one per DOE instance. For now, we might be looking
>    at CDAT as done for this PoC, and CMA/IDE.

The more time goes by the more I am coming around to Bjorn's initial
reaction to all this, that DOE is closer to a VPD model than an
auxiliary_device model or pcie_port_device model. I.e. have some common
discovery in the PCI core for enumerating DOE instances and advertising
protocols, but otherwise leave it up to individual leaf drivers like
cxl_pci or cxl_port to use that core to run a given protocol.

> 2. Use of a pcie_port_service_driver a reasonable way to do this?
> 3. Service provision. It is likely that all of the protocols defined
>    above will be used as part of activities that span multiple devices.
>    a) CDAT used to establish latencies and bandwidth between host CPU
>       and memory on a CXL type 3 device beyond one or more CXL switches.
>    b) CMA. Might just be used to provide simple device attestation
>       and potentially lock out the upstream port above a switch if the
>       switch does not pass attestation. Many many other uses possible...

Per above, once userspace has installed an SPDM session keyid for a given
PCI device it can also optionally set an 'authorized' attribute (similar
to what USB and Thunderbolt have) to indicate whether a device has passed
attestation. As for the actual protocols that are going to run over the
SPDM session, those would need their own drivers that reference the
established keyid.

>    c) Secure CMA / IDE. Likely to be used to set up link IDE. What
>       this will look like is a question I've not really started
>       thinking about yet.
>
>    So how do we support this?
>    If nothing else we need to make sure
>    the drivers for the port don't go away whilst in use.

Another reason to make it a core aspect of the PCI device like VPD so
there are no entanglements beyond "PCI device exists".

> The patch is a very early PoC just to show it would 'work'...
>
> Note I am keen to not have the discussion around this support delay
> Ira's series.

Is there a nearer term forcing function for this? I.e. v5.20 seems to be
where the current DOE series is going to intercept. I think we should
abandon the "aux" organization for now and make DOE like VPD.
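
To make the "DOE like VPD" shape a bit more concrete, here is a rough,
untested-in-kernel sketch written as a plain userspace mock rather than
real kernel code. The idea it tries to show: the core enumerates the DOE
mailboxes of a device once and records which protocols each advertises,
and a leaf driver only asks "which mailbox speaks protocol X?". Every
name below (mock_pci_dev, doe_mb, mock_doe_find, mock_switch_usp) and the
capability offsets are hypothetical, not an existing or proposed kernel
API. The vendor ID and data object type values are the ones I understand
the DOE ECN and CXL 2.0 to assign, so treat them as assumptions too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values: PCI-SIG vendor ID 0x0001, CXL vendor ID 0x1e98. */
#define MOCK_VID_PCI_SIG        0x0001
#define MOCK_VID_CXL            0x1e98
#define MOCK_DOE_DISCOVERY      0x00    /* PCI-SIG: DOE discovery */
#define MOCK_DOE_CMA            0x01    /* PCI-SIG: CMA/SPDM */
#define MOCK_DOE_CXL_TABLE      0x02    /* CXL: table access (CDAT) */

#define MOCK_MAX_PROTS  4
#define MOCK_MAX_MBS    4

/* One (vendor ID, data object type) pair advertised by a mailbox. */
struct doe_prot {
        uint16_t vid;
        uint8_t type;
};

/* One DOE instance, as the core would record it after discovery. */
struct doe_mb {
        uint16_t cap_offset;    /* hypothetical config space offset */
        size_t nr_prots;
        struct doe_prot prots[MOCK_MAX_PROTS];
};

/* Stand-in for the per-device state the PCI core would own. */
struct mock_pci_dev {
        size_t nr_mbs;
        struct doe_mb mbs[MOCK_MAX_MBS];
};

/*
 * Core-side lookup: return the first mailbox on @pdev that advertises
 * (@vid, @type), or NULL if no mailbox does.
 */
static struct doe_mb *mock_doe_find(struct mock_pci_dev *pdev,
                                    uint16_t vid, uint8_t type)
{
        assert(pdev != NULL);

        for (size_t i = 0; i < pdev->nr_mbs; i++) {
                struct doe_mb *mb = &pdev->mbs[i];

                for (size_t j = 0; j < mb->nr_prots; j++)
                        if (mb->prots[j].vid == vid &&
                            mb->prots[j].type == type)
                                return mb;
        }
        return NULL;
}

/* Example data: a switch upstream port with separate CDAT and CMA mailboxes. */
static struct mock_pci_dev mock_switch_usp(void)
{
        struct mock_pci_dev pdev = {
                .nr_mbs = 2,
                .mbs = {
                        { .cap_offset = 0x160, .nr_prots = 2,
                          .prots = { { MOCK_VID_PCI_SIG, MOCK_DOE_DISCOVERY },
                                     { MOCK_VID_CXL, MOCK_DOE_CXL_TABLE } } },
                        { .cap_offset = 0x190, .nr_prots = 2,
                          .prots = { { MOCK_VID_PCI_SIG, MOCK_DOE_DISCOVERY },
                                     { MOCK_VID_PCI_SIG, MOCK_DOE_CMA } } },
                },
        };

        return pdev;
}
```

In this arrangement something like cxl_port would call the lookup for the
CXL table access protocol and a CMA consumer would call it for CMA, with
neither needing a driver bound to the mailbox itself. That is the "no
entanglements beyond PCI device exists" property I am after.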