On Wed, Aug 14, 2024 at 04:49:30PM -0500, Bjorn Helgaas wrote: > On Wed, Aug 14, 2024 at 02:28:59PM +0200, Mariusz Tkaczyk wrote: > > + /* > > + * Use lazy loading for active_indications to not play with initcalls. > > + * It is needed to allow _DSM initialization on DELL platforms, where > > + * ACPI_IPMI must be loaded first. > > + */ > > + unsigned int active_inds_initialized:1; > > What's going on here? I hope we can at least move this to the _DSM > patch since it seems related to that, not to the NPEM capability. I > don't understand the initcall reference or what "lazy loading" means. In previous iterations of this series, the status of all LEDs was read on PCI device enumeration. That was done so that when user space reads the brightness is sysfs, it gets the correct value. The value is cached, it's not re-read from the register on every brightness read. (It's not guaranteed that all LEDs are off on enumeration. E.g. boot firmware may have fiddled with them, or the enclosure itself may have turned some of them on by itself, typically the "ok" LED.) However Stuart reported issues when the _DSM interface is used on Dell servers, because the _DSM requires IPMI drivers to access the NPEM registers. He got a ton of errors when LED status was read on enumeration because that was simply too early. Start of thread: https://lore.kernel.org/all/05455f36-7027-4fd6-8af7-4fe8e483f25c@xxxxxxxxx/ The solution is to read LED status lazily, when brightness is read or written for the first time through sysfs. At that point, IPMI drivers are typically loaded. Stuart reported success with this approach. There is still a possibility that users may see issues if they access brightness before IPMI drivers are loaded. Those drivers may be modules and user space might overzealously try to access brightness before they're loaded. Or user space may prevent them from loading by blacklisting or not installing them. In which case users get to keep the pieces. We discussed various alternative approaches in the above-linked thread but concluded that this pragmatic solution is the simplest that does the job for all but the most pathological use cases. We wanted to make this work on Dell servers, but at the same time minimize the contortions that we need to go through to accommodate their quirky implementation. The code uses lazy initialization of LED status even in the native NPEM case because it would make the code more complex to use early initialization for direct NPEM register access and lazy initialization for _DSM-mediated register access. > Is there some existing ACPI ordering that guarantees ACPI_IPMI happens > first? Why do we need some Dell-specific thing here? > > What is ACPI_IPMI? I guess it refers to the "acpi_ipmi" module, > acpi_ipmi.c? As it turned out in the above-linked thread, just forcing ACPI_IPMI=y for NPEM is not sufficient because additional (Dell-specific) IPMI drivers need to be loaded as well for NPEM register access to work through _DSM. > > +void pci_npem_create(struct pci_dev *dev) > > +{ > > + const struct npem_ops *ops = &npem_ops; > > + int pos = 0, ret; > > + u32 cap; > > + > > + if (!npem_has_dsm(dev)) { > > + pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_NPEM); > > + if (pos == 0) > > + return; > > + > > + if (pci_read_config_dword(dev, pos + PCI_NPEM_CAP, &cap) != 0 || > > + (cap & PCI_NPEM_CAP_CAPABLE) == 0) > > + return; > > + } else { > > + /* > > + * OS should use the DSM for LED control if it is available > > + * PCI Firmware Spec r3.3 sec 4.7. > > + */ > > + return; > > + } > > I know this is sort of a transient state since the next patch adds > full _DSM support, but I do think (a) the fact that NPEM will stop > working simply because firmware adds _DSM support is unexpected > behavior, and (b) npem_has_dsm() and the other ACPI-related stuff > would fit better in the next patch. It's a little strange to have > them mixed here. PCI Firmware Spec r3.3 sec 4.7 says: "OSPM should use this _DSM when available. If this _DSM is not available, OSPM should use Native PCIe Enclosure Management (NPEM) or SCSI Enclosure Services (SES) instead, if available." I realize that a "should" is not a "must", so Linux would in principle be allowed to use direct register access despite presence of the _DSM. However that doesn't feel safe. If the _DSM is present, I think it's fair to assume that the platform firmware wants to control at least a portion of the LEDs itself. Accessing those LEDs directly, behind the platform firmware's back, may cause issues. Not exposing the LEDs to the user in the _DSM case therefore seems safer. Which is why the ACPI stuff to query for _DSM presence is already in this patch instead of the succeeding one. Thanks, Lukas