Re: [PATCH v2 2/3] PCI/NPEM: Add Native PCIe Enclosure Management support

On 6/18/2024 3:56 AM, Mariusz Tkaczyk wrote:
On Sat, 15 Jun 2024 12:33:45 +0200
Lukas Wunner <lukas@xxxxxxxxx> wrote:

On Fri, Jun 14, 2024 at 04:06:14PM -0500, stuart hayes wrote:
On 5/28/2024 8:19 AM, Mariusz Tkaczyk wrote:
+static int pci_npem_init(struct pci_dev *dev, const struct npem_ops *ops,
+			 int pos, u32 caps)
+{
[...]
+	ret = ops->get_active_indications(npem, &active);
+	if (ret) {
+		npem_free(npem);
+		return -EACCES;
+	}

Failing pci_npem_init() if this ops->get_active_indications() fails
will keep this from working on most (all?) Dell servers, because the
_DSM get/set functions use an IPMI operation region to get/set the
active LEDs, and this is getting run before the IPMI drivers and
acpi_ipmi module (which provides ACPI access to IPMI operation
regions) get loaded.  (GET_SUPPORTED_STATES works without IPMI.)

CONFIG_ACPI_IPMI is tristate.  Even if it's built-in, the
module_initcall() becomes a device_initcall().

PCI enumeration happens from a subsys_initcall(), way earlier
than device_initcall().

If you set CONFIG_ACPI_IPMI=y and change the module_initcall() in
drivers/acpi/acpi_ipmi.c to arch_initcall(), does the issue go away?

That seems to be the best option. Please test Lukas' proposal and let me know.
Shouldn't I add a dependency on ACPI_IPMI in Kconfig (with an optional comment
about the initcall)?

+config PCI_NPEM
+	bool "Native PCIe Enclosure Management"
+	depends on LEDS_CLASS=y
+	depends on ACPI_IPMI=y


I tried it just to be sure, but changing the module_initcall() to an
arch_initcall() in acpi_ipmi.c (and compiling it into the kernel) doesn't
help... acpi_ipmi is loaded before npem, but the underlying IPMI drivers
that acpi_ipmi uses to provide the IPMI operation region in ACPI aren't
loaded until later... acpi_ipmi needs the IPMI device.

I'll try to think of another solution.  I don't see how to get the IPMI
drivers to load before PCI devices are enumerated, so it seems to me that
the only way to get it to work from the moment the LEDs show up in sysfs
is to somehow delay the npem driver initialization of the LEDs (at least
for _DSM) or use something to trigger a rescan later.

I notice that acpi_ipmi provides a function to wait for it to have an
IPMI device registered (acpi_wait_for_acpi_ipmi()), which is used by
acpi_power_meter.c.  It would be kind of awkward to use that... it just
waits for a completion (with a 2 second timeout)--it isn't a notifier
or callback.  On my system, the npem driver failed to run a _DSM at
0.72 seconds, and ipmi_si driver didn't initialize the IPMI controller
until 9.54 seconds.


(2) providing a mechanism to trigger this driver to rescan a PCI
     device later from user space

If this was a regular device driver and -EPROBE_DEFER was returned if
IPMI drivers aren't loaded yet, then this would be easy to solve.
But neither is the case here.

Of course it's possible to exercise the "remove" and "rescan" attributes
in sysfs to re-enumerate the device but that's not a great solution.

We cannot expect users to know and do that. If we cannot configure the driver,
we should not register it. We have to guarantee that IPMI commands are
available at the point we are using them.

There is no better place to add a _DSM device than its enumeration, and I have
a feeling that sooner or later someone else will hit this problem, so it would
be better for the community to solve it now.



(3) don't cache the active LEDs--just get the active states using
     NPEM/DSM when brightness is read, and do a get/modify/set when
     setting the brightness... then get_active_indications() wouldn't
     need to be called during init.

Not good.  The LEDs are published in sysfs from a subsys_initcall().
Brightness changes through sysfs could in theory immediately happen
once they're published.  If acpi_ipmi is a module and gets loaded way
later, we'd still have to cope with Set State or Get State DSMs going
nowhere.


Agree. I can do it and it should be safe, but it does not address the issue.
We would narrow the race window, but we would not close it.

Thanks,
Mariusz



