Re: [PATCH v6 2/3] PCI/NPEM: Add Native PCIe Enclosure Management support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 15 Aug 2024 12:42:48 -0500
Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:

> On Thu, Aug 15, 2024 at 07:45:09AM +0200, Lukas Wunner wrote:
> > On Wed, Aug 14, 2024 at 04:49:30PM -0500, Bjorn Helgaas wrote:  
> > > On Wed, Aug 14, 2024 at 02:28:59PM +0200, Mariusz Tkaczyk wrote:  
> > > > +	/*
> > > > +	 * Use lazy loading for active_indications to not play with
> > > > initcalls.
> > > > +	 * It is needed to allow _DSM initialization on DELL
> > > > platforms, where
> > > > +	 * ACPI_IPMI must be loaded first.
> > > > +	 */
> > > > +	unsigned int active_inds_initialized:1;  
> > > 
> > > What's going on here?  I hope we can at least move this to the _DSM
> > > patch since it seems related to that, not to the NPEM capability.  I
> > > don't understand the initcall reference or what "lazy loading" means.  
> > 
> > In previous iterations of this series, the status of all LEDs was
> > read on PCI device enumeration.  That was done so that when user space
> > reads the brightness is sysfs, it gets the correct value.  The value
> > is cached, it's not re-read from the register on every brightness read.
> > 
> > (It's not guaranteed that all LEDs are off on enumeration.  E.g. boot
> > firmware may have fiddled with them, or the enclosure itself may have
> > turned some of them on by itself, typically the "ok" LED.)
> > 
> > However Stuart reported issues when the _DSM interface is used on
> > Dell servers, because the _DSM requires IPMI drivers to access the
> > NPEM registers.  He got a ton of errors when LED status was read on
> > enumeration because that was simply too early.    
> 
> The dependency of _DSM on IPMI sounds like a purely ACPI problem.  Is
> there no mechanism in ACPI to express that dependency?
> 
> If _DSM claims the function is supported before the IPMI driver is
> ready, that sounds like a BIOS defect to me.
> 
> If we're stuck with this, maybe the comment can be reworded.  "Lazy
> loading" in a paragraph that also mentions initcalls and the
> "ACPI_IPMI" module makes it sound like we're talking about loading the
> *module* lazily, not just (IIUC) reading the LED status lazily.
> 
> Maybe it could also explicitly say that the GET_STATE_DSM function
> depends on IPMI.
> 
> I'm unhappy that we're getting our arm twisted here.  If functionality
> depends on IPMI, there really needs to be a way for OSPM to manage
> that dependency.  If we're working around a firmware defect, we need
> to be clear about that.

Hi,

I will move active_inds_initialized:1 to DSM commit and I will add better
justification. For NPEM commit, get_active_indications() will be called once in
pci_npem_init() to avoid referring _DSM specific issues in NPEM commit.

> 
> > > > +void pci_npem_create(struct pci_dev *dev)
> > > > +{
> > > > +	const struct npem_ops *ops = &npem_ops;
> > > > +	int pos = 0, ret;
> > > > +	u32 cap;
> > > > +
> > > > +	if (!npem_has_dsm(dev)) {
> > > > +		pos = pci_find_ext_capability(dev,
> > > > PCI_EXT_CAP_ID_NPEM);
> > > > +		if (pos == 0)
> > > > +			return;
> > > > +
> > > > +		if (pci_read_config_dword(dev, pos + PCI_NPEM_CAP,
> > > > &cap) != 0 ||
> > > > +		    (cap & PCI_NPEM_CAP_CAPABLE) == 0)
> > > > +			return;
> > > > +	} else {
> > > > +		/*
> > > > +		 * OS should use the DSM for LED control if it is
> > > > available
> > > > +		 * PCI Firmware Spec r3.3 sec 4.7.
> > > > +		 */
> > > > +		return;
> > > > +	}  
> > > 
> > > I know this is sort of a transient state since the next patch adds
> > > full _DSM support, but I do think (a) the fact that NPEM will stop
> > > working simply because firmware adds _DSM support is unexpected
> > > behavior, and (b) npem_has_dsm() and the other ACPI-related stuff
> > > would fit better in the next patch.  It's a little strange to have
> > > them mixed here.  
> > 
> > PCI Firmware Spec r3.3 sec 4.7 says:
> > 
> >    "OSPM should use this _DSM when available. If this _DSM is not
> >     available, OSPM should use Native PCIe Enclosure Management (NPEM)
> >     or SCSI Enclosure Services (SES) instead, if available."
> > 
> > I realize that a "should" is not a "must", so Linux would in principle
> > be allowed to use direct register access despite presence of the _DSM.
> > 
> > However that doesn't feel safe.  If the _DSM is present, I think it's
> > fair to assume that the platform firmware wants to control at least
> > a portion of the LEDs itself.  Accessing those LEDs directly, behind the
> > platform firmware's back, may cause issues.  Not exposing the LEDs
> > to the user in the _DSM case therefore seems safer.
> > 
> > Which is why the ACPI stuff to query for _DSM presence is already in
> > this patch instead of the succeeding one.  
> 
> The spec is regrettably vague about this, but that assumption isn't
> unreasonable.  It does deserve a more explicit callout in the commit
> log and probably a dmesg note about why NPEM used to work but no
> longer does.
> 

In fact, there is theoretical case that after firmware update DSM is no longer
available and NPEM is chosen. Given that, I will log chosen backed instead of
trying to predict a change. It is easier to implement it this way. User can
compare working/not-working dmesg logs to see a difference so printing backend
used is enough I think.

Thanks,
Mariusz




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux