On Fri, Oct 13, 2023 at 01:41:57PM +1000, Alistair Francis wrote: > The PCIe 6 specification added support for the Data Object Exchange (DOE). > When DOE is supported the DOE Discovery Feature must be implemented per > PCIe r6.1 sec 6.30.1.1. The protocol allows a requester to obtain > information about the other DOE features supported by the device. > ... > +static umode_t pci_doe_sysfs_attr_is_visible(struct kobject *kobj, > + struct attribute *a, int n) > +{ > + struct pci_dev *pdev = to_pci_dev(kobj_to_dev(kobj)); > + struct pci_doe_mb *doe_mb; > + unsigned long index, j; > + void *entry; > + > + xa_for_each(&pdev->doe_mbs, index, doe_mb) { > + xa_for_each(&doe_mb->feats, j, entry) > + return a->mode; > + } > + > + return 0; The nested loops that don't test anything look a little weird and maybe I'm missing something, but this looks like it returns a->mode if any mailbox with a feature exists, and 0 otherwise. Is that the same as this: if (pdev->doe_mbs) return a->mode; return 0; since it sounds like a mailbox must support at least one feature? > +} > + > +static struct attribute *pci_dev_doe_feature_attrs[] = { > + NULL, > +}; > + > +const struct attribute_group pci_dev_doe_feature_group = { > + .name = "doe_features", > + .attrs = pci_dev_doe_feature_attrs, > + .is_visible = pci_doe_sysfs_attr_is_visible, > +}; > + > +static ssize_t pci_doe_sysfs_feature_show(struct device *dev, > + struct device_attribute *attr, > + char *buf) > +{ > + return sysfs_emit(buf, "%s\n", attr->attr.name); > +} > + > +static void pci_doe_sysfs_feature_remove(struct pci_dev *pdev, > + struct pci_doe_mb *doe_mb) > +{ > + struct device *dev = &pdev->dev; > + struct device_attribute *attrs = doe_mb->sysfs_attrs; > + unsigned long i; > + void *entry; > + > + if (!attrs) > + return; > + > + doe_mb->sysfs_attrs = NULL; > + xa_for_each(&doe_mb->feats, i, entry) { > + if (attrs[i].show) > + sysfs_remove_file_from_group(&dev->kobj, &attrs[i].attr, > + pci_dev_doe_feature_group.name); > + kfree(attrs[i].attr.name); > + } > + kfree(attrs); > +} > + > +static int pci_doe_sysfs_feature_populate(struct pci_dev *pdev, > + struct pci_doe_mb *doe_mb) > +{ > + struct device *dev = &pdev->dev; > + struct device_attribute *attrs; > + unsigned long num_features = 0; > + unsigned long vid, type; > + unsigned long i; > + void *entry; > + int ret; > + > + xa_for_each(&doe_mb->feats, i, entry) > + num_features++; > + > + attrs = kcalloc(num_features, sizeof(*attrs), GFP_KERNEL); > + if (!attrs) > + return -ENOMEM; > + > + doe_mb->sysfs_attrs = attrs; > + xa_for_each(&doe_mb->feats, i, entry) { > + sysfs_attr_init(&attrs[i].attr); > + vid = xa_to_value(entry) >> 8; > + type = xa_to_value(entry) & 0xFF; > + attrs[i].attr.name = kasprintf(GFP_KERNEL, > + "0x%04lX:%02lX", vid, type); What's the rationale for using "0x" on the vendor ID but not on the type? "0x1234:10" hints that the "10" might be decimal since it lacks "0x". Suggest lower-case "%04lx:%02lx" either way. FWIW, there's no "0x" prefix on the hex vendor IDs in "lspci -n" output and dmesg messages like this: pci 0000:01:00.0: [10de:13b6] type 00 > + if (!attrs[i].attr.name) { > + ret = -ENOMEM; > + goto fail; > + } > + > + attrs[i].attr.mode = 0444; > + attrs[i].show = pci_doe_sysfs_feature_show; > + > + ret = sysfs_add_file_to_group(&dev->kobj, &attrs[i].attr, > + pci_dev_doe_feature_group.name); > + if (ret) { > + attrs[i].show = NULL; > + goto fail; > + } > + } > + > + return 0; > + > +fail: > + pci_doe_sysfs_feature_remove(pdev, doe_mb); > + return ret; Not sure we should treat this failure this seriously. Looks like it will prevent creation of the BAR resource files, and possibly even abort pci_sysfs_init() early. I think the pci_dev will still be usable (lacking DOE sysfs) even if this fails. > +} > + > +void pci_doe_sysfs_teardown(struct pci_dev *pdev) > +{ > + struct pci_doe_mb *doe_mb; > + unsigned long index; > + > + xa_for_each(&pdev->doe_mbs, index, doe_mb) { > + pci_doe_sysfs_feature_remove(pdev, doe_mb); > + } > +} > + > +int pci_doe_sysfs_init(struct pci_dev *pdev) > +{ > + struct pci_doe_mb *doe_mb; > + unsigned long index; > + int ret; > + > + xa_for_each(&pdev->doe_mbs, index, doe_mb) { > + ret = pci_doe_sysfs_feature_populate(pdev, doe_mb); > + if (ret) > + return ret; > + } I agree with Lukas that pci_create_resource_files() is not the right place to call this. I try hard to avoid calling *anything* from the pci_create_sysfs_dev_files() path because it has the nasty "sysfs_initialized" check and the associated pci_sysfs_init() initcall. It's so much cleaner when we can set up static attributes that are automatically added in the device_add() path. I don't know whether that's possible. I see lots of discussion with Greg KH that might be related, but I'm not sure. I do know that we enumerate the mailboxes and features during pci_init_capabilities(), which happens before device_add(), so the information about which attributes should be present is at least *available* early enough: pci_host_probe pci_scan_root_bus_bridge ... pci_scan_single_device pci_device_add pci_init_capabilities pci_doe_init while (pci_find_next_ext_capability(PCI_EXT_CAP_ID_DOE)) pci_doe_create_mb pci_doe_cache_features pci_doe_discovery xa_insert(&doe_mb->feats) <-- device_add device_add_attrs device_add_groups pci_bus_add_devices pci_bus_add_device pci_create_sysfs_dev_files ... pci_doe_sysfs_init <-- xa_for_each(&pdev->doe_mbs) pci_doe_sysfs_feature_populate xa_for_each(&doe_mb->feats) sysfs_add_file_to_group(pci_dev_doe_feature_group.name) Is it feasible to build an attribute group in pci_doe_init() and add it to dev->groups so device_add() will automatically add them? It looks like __power_supply_register() does something like that: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/power/supply/power_supply_core.c?id=v6.5#n1375 > --- a/include/linux/pci-doe.h > +++ b/include/linux/pci-doe.h > @@ -22,4 +22,6 @@ int pci_doe(struct pci_doe_mb *doe_mb, u16 vendor, u8 type, > const void *request, size_t request_sz, > void *response, size_t response_sz); > > +int pci_doe_sysfs_init(struct pci_dev *pci_dev); > +void pci_doe_sysfs_teardown(struct pci_dev *pdev); These declarations look like they should be in drivers/pci/pci.h as well. I don't think anything outside the PCI core should need these. Bjorn