On Thu, May 23, 2024 at 9:24 PM Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> wrote: > > On Wed, 22 May 2024 20:11:41 +1000 > Alistair Francis <alistair23@xxxxxxxxx> wrote: > > > The PCIe 6 specification added support for the Data Object > > Exchange (DOE). > > When DOE is supported the DOE Discovery Feature must be implemented per > > PCIe r6.1 sec 6.30.1.1. The protocol allows a requester to obtain > > information about the other DOE features supported by the device. > > > > The kernel is already querying the DOE features supported and cacheing > > the values. Expose the values in sysfs to allow user space to > > determine which DOE features are supported by the PCIe device. > > > > By exposing the information to userspace tools like lspci can relay the > > information to users. By listing all of the supported features we can > > allow userspace to parse the list, which might include > > vendor specific features as well as yet to be supported features. > > > > As the DOE Discovery feature must always be supported we treat it as a > > special named attribute case. This allows the usual PCI attribute_group > > handling to correctly create the doe_features directory when registering > > pci_doe_sysfs_group (otherwise it doesn't and sysfs_add_file_to_group() > > will seg fault). > > > > After this patch is supported you can see something like this when > > attaching a DOE device > > > > $ ls /sys/devices/pci0000:00/0000:00:02.0//doe* > > 0001:01 0001:02 doe_discovery > > > > Signed-off-by: Alistair Francis <alistair.francis@xxxxxxx> > > What happens if multiple DOE which support the same protocol? > (IIRC that's allowed). You probably need to paper over repeat > sysfs attributes and make sure they don't get double freed etc. Fair point. I changed pci_doe_sysfs_feature_populate() to not fall over if the entry already exists, we just skip adding it. pci_doe_sysfs_feature_remove() should already handle double entries with the attrs[i].show check. > > Otherwise some minor things inline. > > Jonathan > > > > --- > > v10: > > - Rebase to use DEFINE_SYSFS_GROUP_VISIBLE and remove > > special setup function > > v9: > > - Add a teardown function > > - Rename functions to be clearer > > - Tidy up the commit message > > - Remove #ifdef from header > > v8: > > - Inlucde an example in the docs > > - Fixup removing a file that wasn't added > > - Remove a blank line > > v7: > > - Fixup the #ifdefs to keep the test robot happy > > v6: > > - Use "feature" instead of protocol > > - Don't use any devm_* functions > > - Add two more patches to the series > > v5: > > - Return the file name as the file contents > > - Code cleanups and simplifications > > v4: > > - Fixup typos in the documentation > > - Make it clear that the file names contain the information > > - Small code cleanups > > - Remove most #ifdefs > > - Remove extra NULL assignment > > v3: > > - Expose each DOE feature as a separate file > > v2: > > - Add documentation > > - Code cleanups > > > > Documentation/ABI/testing/sysfs-bus-pci | 28 ++++ > > drivers/pci/doe.c | 175 ++++++++++++++++++++++++ > > drivers/pci/pci-sysfs.c | 13 ++ > > drivers/pci/pci.h | 10 ++ > > 4 files changed, 226 insertions(+) > > > > diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci > > index ecf47559f495..65a3238ab701 100644 > > --- a/Documentation/ABI/testing/sysfs-bus-pci > > +++ b/Documentation/ABI/testing/sysfs-bus-pci > > @@ -500,3 +500,31 @@ Description: > > console drivers from the device. Raw users of pci-sysfs > > resourceN attributes must be terminated prior to resizing. > > Success of the resizing operation is not guaranteed. > > + > > +What: /sys/bus/pci/devices/.../doe_features > > +Date: May 2024 > > +Contact: Linux PCI developers <linux-pci@xxxxxxxxxxxxxxx> > > +Description: > > + This directory contains a list of the supported > > + Data Object Exchange (DOE) features. The features are > > + the file name. The contents of each file is the raw vendor id and > > + data object feature values. > > + > > + The value comes from the device and specifies the vendor and > > + data object type supported. The lower (RHS of the colon) is > > + the data object type in hex. The upper (LHS of the colon) > > + is the vendor ID. > > + > > + As all DOE devices must support the DOE discovery protocol, if > > + DOE is supported you will at least see the doe_discovery file, with > > + this contents > > + > > + # cat doe_features/doe_discovery > > + 0001:00 > > + > > + If the device supports other protocols you will see other files > > + as well. For example is CMA/SPDM and secure CMA/SPDM are supported > > + the doe_features directory will look like this > > + > > + # ls doe_features > > + 0001:01 0001:02 doe_discovery > > diff --git a/drivers/pci/doe.c b/drivers/pci/doe.c > > index defc4be81bd4..7a20a257df5a 100644 > > --- a/drivers/pci/doe.c > > +++ b/drivers/pci/doe.c > > @@ -47,6 +47,7 @@ > > * @wq: Wait queue for work item > > * @work_queue: Queue of pci_doe_work items > > * @flags: Bit array of PCI_DOE_FLAG_* flags > > + * @sysfs_attrs: Array of sysfs device attributes > > */ > > struct pci_doe_mb { > > struct pci_dev *pdev; > > @@ -56,6 +57,10 @@ struct pci_doe_mb { > > wait_queue_head_t wq; > > struct workqueue_struct *work_queue; > > unsigned long flags; > > + > > +#ifdef CONFIG_SYSFS > > + struct device_attribute *sysfs_attrs; > > +#endif > > }; > > > > struct pci_doe_feature { > > @@ -92,6 +97,176 @@ struct pci_doe_task { > > struct pci_doe_mb *doe_mb; > > }; > > > > +#ifdef CONFIG_SYSFS > > +static ssize_t doe_discovery_show(struct device *dev, > > + struct device_attribute *attr, > > + char *buf) > > +{ > > + return sysfs_emit(buf, "0001:00\n"); > > +} > > +DEVICE_ATTR_RO(doe_discovery); > > + > > +static struct attribute *pci_doe_sysfs_feature_attrs[] = { > > + &dev_attr_doe_discovery.attr, > > + NULL, > > No comma needed on the null terminator as we'll never add anything after > it. > > > +}; > > + > > +static umode_t pci_doe_sysfs_attr_visible(struct kobject *kobj, > > + struct attribute *a, int n) > > +{ > > + struct pci_dev *pdev = to_pci_dev(kobj_to_dev(kobj)); > > + struct pci_doe_mb *doe_mb; > > + unsigned long index, j; > > + unsigned long vid, type; > > + void *entry; > > + > > + xa_for_each(&pdev->doe_mbs, index, doe_mb) { > > + xa_for_each(&doe_mb->feats, j, entry) { > > + vid = xa_to_value(entry) >> 8; > > + type = xa_to_value(entry) & 0xFF; > > + > > + if (vid == 0x01 && type == 0x00) { > > + /* This is the DOE discovery protocol > local comment syntax is the > /* > * This is the > form so stick to that. > > > Shouldn't this also return a->mode for any case where the particular attribute > matches? I guess is_visible() isn't called for late registered sysfs attributes > though I think it probably should be! The is_visible is only called for the original doe_features attribute, I will update the names of the functions to make this clear. > > > + * Every DOE instance must support this, so we > > + * give it a useful name. > > + */ > > + return a->mode; > > + } > > + } > > + } > > + > > + return 0; > > +} > > + > > +static bool pci_doe_sysfs_group_visible(struct kobject *kobj) > > +{ > > + struct pci_dev *pdev = to_pci_dev(kobj_to_dev(kobj)); > > + struct pci_doe_mb *doe_mb; > > + unsigned long index, j; > > + void *entry; > > + > > + xa_for_each(&pdev->doe_mbs, index, doe_mb) { > > + xa_for_each(&doe_mb->feats, j, entry) > Is this simpler as > if (!xa_empty(&doe_mb->feats)) > return true; Fine with me > > > + return true; > > + } > > + > > + return false; > > +} > > +DEFINE_SYSFS_GROUP_VISIBLE(pci_doe_sysfs) > > + > > +const struct attribute_group pci_doe_sysfs_group = { > > + .name = "doe_features", > > + .attrs = pci_doe_sysfs_feature_attrs, > > + .is_visible = SYSFS_GROUP_VISIBLE(pci_doe_sysfs), > > +}; > > + > > +static ssize_t pci_doe_sysfs_feature_show(struct device *dev, > > + struct device_attribute *attr, > > + char *buf) > > +{ > > + return sysfs_emit(buf, "%s\n", attr->attr.name); > > +} > > + > > +static void pci_doe_sysfs_feature_remove(struct pci_dev *pdev, > > + struct pci_doe_mb *doe_mb) > > +{ > > + struct device_attribute *attrs = doe_mb->sysfs_attrs; > > + struct device *dev = &pdev->dev; > > + unsigned long i; > > + void *entry; > > + > > + if (!attrs) > > + return; > > + > > + doe_mb->sysfs_attrs = NULL; > > + xa_for_each(&doe_mb->feats, i, entry) { > > I'm not particularly keen on using an index over the xa > just to get the number of elements for the loop limit. > Maybe just store that when you allocate attrs? Is that really any better? Then we have another value to keep track of. Plus this gets trickier if we skip a duplicate entry. > > > + if (attrs[i].show) > > + sysfs_remove_file_from_group(&dev->kobj, &attrs[i].attr, > > + pci_doe_sysfs_group.name); > > + kfree(attrs[i].attr.name); > > + } > > + kfree(attrs); > > +} > > + > > +static int pci_doe_sysfs_feature_populate(struct pci_dev *pdev, > > + struct pci_doe_mb *doe_mb) > > +{ > > + struct device *dev = &pdev->dev; > > + struct device_attribute *attrs; > > + unsigned long num_features = 0; > > + unsigned long vid, type; > > + unsigned long i; > > + void *entry; > > + int ret; > > + > > + xa_for_each(&doe_mb->feats, i, entry) > > + num_features++; > > + > > + attrs = kcalloc(num_features, sizeof(*attrs), GFP_KERNEL); > > + if (!attrs) > > + return -ENOMEM; > > + > > + doe_mb->sysfs_attrs = attrs; > > + xa_for_each(&doe_mb->feats, i, entry) { > > + sysfs_attr_init(&attrs[i].attr); > > + vid = xa_to_value(entry) >> 8; > > + type = xa_to_value(entry) & 0xFF; > > + > > + if (vid == 0x01 && type == 0x00) { > > + // DOE Discovery, manually displayed by `dev_attr_doe_discovery` > > /* */ syntax. > > > + continue; > > + } > > + > > + attrs[i].attr.name = kasprintf(GFP_KERNEL, > > + "%04lx:%02lx", vid, type); > > + if (!attrs[i].attr.name) { > > + ret = -ENOMEM; > > + goto fail; > > + } > > + > > + attrs[i].attr.mode = 0444; > > + attrs[i].show = pci_doe_sysfs_feature_show; > > + > > + ret = sysfs_add_file_to_group(&dev->kobj, &attrs[i].attr, > > + pci_doe_sysfs_group.name); > > + if (ret) { > > + attrs[i].show = NULL; > > + goto fail; > > Repeated DOE 'features' on different DOE instances may cause this to fail. We just skip that case then Alistair > > > > + } > > + } > > + > > + return 0; > > + > > +fail: > > + pci_doe_sysfs_feature_remove(pdev, doe_mb); > > + return ret; > > +} > > > static int pci_doe_wait(struct pci_doe_mb *doe_mb, unsigned long timeout) > > { > > if (wait_event_timeout(doe_mb->wq, > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > > index 40cfa716392f..b5db191cb29f 100644 > > --- a/drivers/pci/pci-sysfs.c > > +++ b/drivers/pci/pci-sysfs.c > > @@ -16,6 +16,7 @@ > > #include <linux/kernel.h> > > #include <linux/sched.h> > > #include <linux/pci.h> > > +#include <linux/pci-doe.h> > > #include <linux/stat.h> > > #include <linux/export.h> > > #include <linux/topology.h> > > @@ -1143,6 +1144,9 @@ static void pci_remove_resource_files(struct pci_dev *pdev) > > { > > int i; > > > > + if (IS_ENABLED(CONFIG_PCI_DOE)) > > + pci_doe_sysfs_teardown(pdev); > > + > > for (i = 0; i < PCI_STD_NUM_BARS; i++) { > > struct bin_attribute *res_attr; > > > > @@ -1227,6 +1231,12 @@ static int pci_create_resource_files(struct pci_dev *pdev) > > int i; > > int retval; > > > > + if (IS_ENABLED(CONFIG_PCI_DOE)) { > > + retval = pci_doe_sysfs_init(pdev); > > + if (retval) > > + return retval; > > + } > > + > > /* Expose the PCI resources from this device as files */ > > for (i = 0; i < PCI_STD_NUM_BARS; i++) { > > > > @@ -1661,6 +1671,9 @@ const struct attribute_group *pci_dev_attr_groups[] = { > > #endif > > #ifdef CONFIG_PCIEASPM > > &aspm_ctrl_attr_group, > > +#endif > > +#ifdef CONFIG_PCI_DOE > > + &pci_doe_sysfs_group, > > #endif > > NULL, > > }; > >