Re: [PATCH v6 2/3] PCI/NPEM: Add Native PCIe Enclosure Management support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 14, 2024 at 04:49:30PM -0500, Bjorn Helgaas wrote:
> On Wed, Aug 14, 2024 at 02:28:59PM +0200, Mariusz Tkaczyk wrote:
> > +	/*
> > +	 * Use lazy loading for active_indications to not play with initcalls.
> > +	 * It is needed to allow _DSM initialization on DELL platforms, where
> > +	 * ACPI_IPMI must be loaded first.
> > +	 */
> > +	unsigned int active_inds_initialized:1;
> 
> What's going on here?  I hope we can at least move this to the _DSM
> patch since it seems related to that, not to the NPEM capability.  I
> don't understand the initcall reference or what "lazy loading" means.

In previous iterations of this series, the status of all LEDs was
read on PCI device enumeration.  That was done so that when user space
reads the brightness is sysfs, it gets the correct value.  The value
is cached, it's not re-read from the register on every brightness read.

(It's not guaranteed that all LEDs are off on enumeration.  E.g. boot
firmware may have fiddled with them, or the enclosure itself may have
turned some of them on by itself, typically the "ok" LED.)

However Stuart reported issues when the _DSM interface is used on
Dell servers, because the _DSM requires IPMI drivers to access the
NPEM registers.  He got a ton of errors when LED status was read on
enumeration because that was simply too early.  Start of thread:

https://lore.kernel.org/all/05455f36-7027-4fd6-8af7-4fe8e483f25c@xxxxxxxxx/

The solution is to read LED status lazily, when brightness is read
or written for the first time through sysfs.  At that point, IPMI
drivers are typically loaded.  Stuart reported success with this
approach.  There is still a possibility that users may see issues
if they access brightness before IPMI drivers are loaded.  Those
drivers may be modules and user space might overzealously try to
access brightness before they're loaded.  Or user space may prevent
them from loading by blacklisting or not installing them.  In which
case users get to keep the pieces.

We discussed various alternative approaches in the above-linked thread
but concluded that this pragmatic solution is the simplest that does
the job for all but the most pathological use cases.

We wanted to make this work on Dell servers, but at the same time
minimize the contortions that we need to go through to accommodate
their quirky implementation.

The code uses lazy initialization of LED status even in the native
NPEM case because it would make the code more complex to use early
initialization for direct NPEM register access and lazy initialization
for _DSM-mediated register access.


> Is there some existing ACPI ordering that guarantees ACPI_IPMI happens
> first?  Why do we need some Dell-specific thing here?
> 
> What is ACPI_IPMI?  I guess it refers to the "acpi_ipmi" module,
> acpi_ipmi.c?

As it turned out in the above-linked thread, just forcing ACPI_IPMI=y
for NPEM is not sufficient because additional (Dell-specific) IPMI
drivers need to be loaded as well for NPEM register access to work
through _DSM.


> > +void pci_npem_create(struct pci_dev *dev)
> > +{
> > +	const struct npem_ops *ops = &npem_ops;
> > +	int pos = 0, ret;
> > +	u32 cap;
> > +
> > +	if (!npem_has_dsm(dev)) {
> > +		pos = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_NPEM);
> > +		if (pos == 0)
> > +			return;
> > +
> > +		if (pci_read_config_dword(dev, pos + PCI_NPEM_CAP, &cap) != 0 ||
> > +		    (cap & PCI_NPEM_CAP_CAPABLE) == 0)
> > +			return;
> > +	} else {
> > +		/*
> > +		 * OS should use the DSM for LED control if it is available
> > +		 * PCI Firmware Spec r3.3 sec 4.7.
> > +		 */
> > +		return;
> > +	}
> 
> I know this is sort of a transient state since the next patch adds
> full _DSM support, but I do think (a) the fact that NPEM will stop
> working simply because firmware adds _DSM support is unexpected
> behavior, and (b) npem_has_dsm() and the other ACPI-related stuff
> would fit better in the next patch.  It's a little strange to have
> them mixed here.

PCI Firmware Spec r3.3 sec 4.7 says:

   "OSPM should use this _DSM when available. If this _DSM is not
    available, OSPM should use Native PCIe Enclosure Management (NPEM)
    or SCSI Enclosure Services (SES) instead, if available."

I realize that a "should" is not a "must", so Linux would in principle
be allowed to use direct register access despite presence of the _DSM.

However that doesn't feel safe.  If the _DSM is present, I think it's
fair to assume that the platform firmware wants to control at least
a portion of the LEDs itself.  Accessing those LEDs directly, behind the
platform firmware's back, may cause issues.  Not exposing the LEDs
to the user in the _DSM case therefore seems safer.

Which is why the ACPI stuff to query for _DSM presence is already in
this patch instead of the succeeding one.

Thanks,

Lukas




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux