Jonathan Cameron wrote: > On Tue, 30 Jan 2024 01:24:14 -0800 > Dan Williams <dan.j.williams@xxxxxxxxx> wrote: > > > The PCIe 6.1 specification, section 11, introduces the Trusted > > Execution Environment (TEE) Device Interface Security Protocol (TDISP). > > This interface definition builds upon CMA, component measurement and > > authentication, and IDE, link integrity and data encryption. It adds > > support for establishing virtual functions within a device that can be > > assigned to a confidential VM such that the assigned device is enabled > > to access guest private memory protected by technologies like Intel TDX, > > AMD SEV-SNP, RISCV COVE, or ARM CCA. > > > > The "TSM" (TEE Security Manager) is a concept in the TDISP specification > > of an agent that mediates between a device security manager (DSM) and > > system software in both a VMM and a VM. From a Linux perspective the TSM > > abstracts many of the details of TDISP, IDE, and CMA. Some of those > > details leak through at times, but for the most part TDISP is an > > internal implementation detail of the TSM. > > > > Similar to the PCI core extensions to support CONFIG_PCI_CMA, > > CONFIG_PCI_TSM builds upon that to reuse the "authenticated" sysfs > > attribute, and add more properties + controls in a tsm/ subdirectory of > > the PCI device sysfs interface. Unlike CMA that can depend on a local to > > the PCI core implementation, PCI_TSM needs to be prepared for late > > loading of the platform TSM driver. Consider that the TSM driver may > > itself be a PCI driver. Userspace can depend on the common TSM device > > uevent to know when the PCI core has TSM services enabled. The PCI > > device tsm/ subdirectory is supplemented by the TSM device pci/ > > directory for platform global TSM properties + controls. > > > > All vendor TSM implementations share the property of asking the VMM to > > perform DOE mailbox operations on behalf of the TSM. That common > > capability is centralized in PCI core code that invokes an ->exec() > > operation callback potentially multiple times to service a given request > > (struct pci_tsm_req). Future operations / verbs will be handled > > similarly with the "request + exec" model. For now, only "connect" and > > "disconnect" are implemented which at a minimum is expected to establish > > IDE for the link. > > > > In addition to requests the low-level TSM implementation is notified of > > device arrival and departure events so that it can filter devices that > > the TSM is not prepared to support, or otherwise setup and teardown > > per-device context. > > > > Cc: Wu Hao <hao.wu@xxxxxxxxx> > > Cc: Yilun Xu <yilun.xu@xxxxxxxxx> > > Cc: Lukas Wunner <lukas@xxxxxxxxx> > > Cc: Samuel Ortiz <sameo@xxxxxxxxxxxx> > > Cc: Alexey Kardashevskiy <aik@xxxxxxx> > > Cc: Bjorn Helgaas <bhelgaas@xxxxxxxxxx> > > Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx> > > Superficial comments inline - noticed whilst getting my head > around the basic structure. > > > > > > diff --git a/drivers/pci/tsm.c b/drivers/pci/tsm.c > > new file mode 100644 > > index 000000000000..f74de0ee49a0 > > --- /dev/null > > +++ b/drivers/pci/tsm.c > > @@ -0,0 +1,346 @@ > > > + > > +DEFINE_FREE(req_free, struct pci_tsm_req *, if (_T) tsm_ops->req_free(_T)) > > + > > +static int pci_tsm_disconnect(struct pci_dev *pdev) > > +{ > > + struct pci_tsm_req *req __free(req_free) = NULL; > > As below. I'll stop commenting on these. Hey, they are fair game, will find a way to rework this and not use the confusing pattern. > > + > > + /* opportunistic state checks to skip allocating a request */ > > + if (pdev->tsm->state < PCI_TSM_CONNECT) > > + return 0; > > + > > + req = tsm_ops->req_alloc(pdev, PCI_TSM_OP_DISCONNECT); > > + if (!req) > > + return -ENOMEM; > > + > > + scoped_cond_guard(mutex_intr, return -EINTR, tsm_ops->lock) { > > + enum pci_tsm_op_status status; > > + > > + /* revalidate state */ > > + if (pdev->tsm->state < PCI_TSM_CONNECT) > > + return 0; > > + if (pdev->tsm->state < PCI_TSM_INIT) > > + return -ENXIO; > > + > > + do { > > + status = tsm_ops->exec(pdev, req); > > + req->seq++; > > + /* TODO: marshal SPDM request */ > > + } while (status == PCI_TSM_SPDM_REQ); > > + > > + if (status == PCI_TSM_FAIL) > > + return -EIO; > > + pdev->tsm->state = PCI_TSM_INIT; > > + } > > + return 0; > > +} > > + > > +static int pci_tsm_connect(struct pci_dev *pdev) > > +{ > > + struct pci_tsm_req *req __free(req_free) = NULL; > > Push down a few lines to put it where the allocation happens. > > > + > > + /* opportunistic state checks to skip allocating a request */ > > + if (pdev->tsm->state >= PCI_TSM_CONNECT) > > + return 0; > > + > > + req = tsm_ops->req_alloc(pdev, PCI_TSM_OP_CONNECT); > > + if (!req) > > + return -ENOMEM; > > + > > + scoped_cond_guard(mutex_intr, return -EINTR, tsm_ops->lock) { > > What could possibly go wrong? Everyone loves scoped_cond_guard ;) Indeed. > > > + enum pci_tsm_op_status status; > > + > > + /* revalidate state */ > > + if (pdev->tsm->state >= PCI_TSM_CONNECT) > > + return 0; > > + if (pdev->tsm->state < PCI_TSM_INIT) > > + return -ENXIO; > > + > > + do { > > + status = tsm_ops->exec(pdev, req); > > + req->seq++; > > + } while (status == PCI_TSM_SPDM_REQ); > > + > > + if (status == PCI_TSM_FAIL) > > + return -EIO; > > + pdev->tsm->state = PCI_TSM_CONNECT; > > + } > > + return 0; > > +} > > ... > > > + size_t connect_mode_store(struct device *dev, > > + struct device_attribute *attr, > > + const char *buf, size_t len) > > +{ > > + struct pci_dev *pdev = to_pci_dev(dev); > > + int i; > > + > > + guard(mutex)(tsm_ops->lock); > > + if (pdev->tsm->state >= PCI_TSM_CONNECT) > > + return -EBUSY; > > + for (i = 0; i < ARRAY_SIZE(pci_tsm_modes); i++) > > + if (sysfs_streq(buf, pci_tsm_modes[i])) > > + break; > > + if (i == PCI_TSM_MODE_LINK) { > > + if (pdev->tsm->link_capable) > > + pdev->tsm->mode = PCI_TSM_MODE_LINK; > > Error in all real paths paths? Uh... yeah, did I mention that this untested. Will fix. > Also, maybe a switch will be more sensible here. Sure. > > > + return -EOPNOTSUPP; > > + } else if (i == PCI_TSM_MODE_SELECTIVE) { > > + if (pdev->tsm->selective_capable) > > + pdev->tsm->mode = PCI_TSM_MODE_SELECTIVE; > > + return -EOPNOTSUPP; > > + } else > > + return -EINVAL; > > + return len; > > +} > > > > + > > +void pci_tsm_init(struct pci_dev *pdev) > > +{ > > + u16 ide_cap; > > + int rc; > > + > > + if (!pdev->cma_capable) > > + return; > > + > > + ide_cap = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_IDE); > > + if (!ide_cap) > > + return; > > + > > + struct pci_tsm *tsm __free(kfree) = kzalloc(sizeof(*tsm), GFP_KERNEL); > > + if (!tsm) > > + return; > > + > > + tsm->ide_cap = ide_cap; > > + > > + rc = xa_insert(&pci_tsm_devs, (unsigned long)pdev, pdev, GFP_KERNEL); > > This is an unusual pattern with the key and the value matching. > Feels like xarray might for once not be the best choice of structure. This is the "use xarray instead of a linked list patterni". It would be useful to maybe make the key be the Segment+BDF, but I did not take the time to figure out if that can fit in an unsigned long. In the meantime this saves needing to embed a linked list node in 'struct pci_dev'. [..] > > diff --git a/drivers/virt/coco/tsm/class.c b/drivers/virt/coco/tsm/class.c > > index a569fa6b09eb..a459e51c0892 100644 > > --- a/drivers/virt/coco/tsm/class.c > > +++ b/drivers/virt/coco/tsm/class.c > > > +static const struct attribute_group *tsm_attr_groups[] = { > > +#ifdef CONFIG_PCI_TSM > > + &tsm_pci_attr_group, > > +#endif > > + NULL, > Trivial but, no point in that comma as we will never chase it with anything. > > +}; > > + > > static int __init tsm_init(void) > > { > > tsm_class = class_create("tsm"); > > @@ -86,6 +97,7 @@ static int __init tsm_init(void) > > return PTR_ERR(tsm_class); > > > > tsm_class->dev_release = tsm_release; > > + tsm_class->dev_groups = tsm_attr_groups; > > return 0; > > } > > module_init(tsm_init) > > diff --git a/drivers/virt/coco/tsm/pci.c b/drivers/virt/coco/tsm/pci.c > > new file mode 100644 > > index 000000000000..b3684ad7114f > > --- /dev/null > > +++ b/drivers/virt/coco/tsm/pci.c > > > + > > +static bool tsm_pci_group_visible(struct kobject *kobj) > > +{ > > + struct device *dev = kobj_to_dev(kobj); > > + struct tsm_subsys *subsys = container_of(dev, typeof(*subsys), dev); > Give this a macro probably. > dev_to_tsm_subsys(kobj_to_dev(kobj)); Sure. > > > + > > + if (subsys->info->pci_ops) > > + return true; > > return subsys->info->pci->ops; > maybe True, maybe with a !! for good measure. [..] > > diff --git a/drivers/virt/coco/tsm/tsm.h b/drivers/virt/coco/tsm/tsm.h > > new file mode 100644 > > index 000000000000..407c388a109b > > --- /dev/null > > +++ b/drivers/virt/coco/tsm/tsm.h > > @@ -0,0 +1,28 @@ > > +/* SPDX-License-Identifier: GPL-2.0 */ > > +#ifndef __TSM_CORE_H > > +#define __TSM_CORE_H > > + > > +#include <linux/device.h> > > + > > +struct tsm_info; > > +struct tsm_subsys { > > + struct device dev; > > + const struct tsm_info *info; > > +}; > > I know people like to build up patch sets, but I think defining this here > from patch 3 would just reduce noise. Ok. > > > + > > +#ifdef CONFIG_PCI_TSM > > +int tsm_pci_init(const struct tsm_info *info); > > +void tsm_pci_destroy(const struct tsm_info *info); > > +extern const struct attribute_group tsm_pci_attr_group; > > +#else > > +static inline int tsm_pci_init(const struct tsm_info *info) > > +{ > > + return 0; > > +} > > +static inline void tsm_pci_destroy(const struct tsm_info *info) > > +{ > > +} > > +#endif > > + > > +#endif /* TSM_CORE_H */ > > > diff --git a/include/linux/tsm.h b/include/linux/tsm.h > > index 8cb8a661ba41..f5dbdfa65d8d 100644 > > --- a/include/linux/tsm.h > > +++ b/include/linux/tsm.h > > @@ -4,11 +4,15 @@ > > > > #include <linux/sizes.h> > > #include <linux/types.h> > > +#include <linux/mutex.h> > > struct mutex; > instead given you only have a pointer I think. True, but see below I expect this lock to move somewhere else in the next version. > > > > > > struct tsm_info { > > const char *name; > > struct device *host; > > const struct attribute_group **groups; > > + const struct tsm_pci_ops *pci_ops; > > + unsigned int nr_selective_streams; > > + unsigned int link_stream_capable:1; > > }; > > > > +struct pci_dev; > > +/** > > + * struct tsm_pci_ops - Low-level TSM-exported interface to the PCI core > > + * @add: accept device for tsm operation, locked > > + * @del: teardown tsm context for @pdev, locked > > + * @req_alloc: setup context for given operation, unlocked > > + * @req_free: teardown context for given request, unlocked > > + * @exec: run @req, may be invoked multiple times per @req, locked > > + * @lock: tsm work is one device and one op at a time > > + */ > > +struct tsm_pci_ops { > > + int (*add)(struct pci_dev *pdev); > > + void (*del)(struct pci_dev *pdev); > > + struct pci_tsm_req *(*req_alloc)(struct pci_dev *pdev, > > + enum pci_tsm_op op); > > + struct pci_tsm_req *(*req_free)(struct pci_tsm_req *req); > > + enum pci_tsm_op_status (*exec)(struct pci_dev *pdev, > > + struct pci_tsm_req *req); > > + struct mutex *lock; > > Mixing data with what would otherwise be likely to be constant pointers > probably best avoided. Maybe wrap the lock in another op instead? After chatting with Alexey this lock is too coarse and will move to a per device lock rather than locking the entire interface.