Re: [PATCH v2 2/3] PCI/NPEM: Add Native PCIe Enclosure Management support

On Wed, 19 Jun 2024 11:08:26 +0200
Lukas Wunner <lukas@xxxxxxxxx> wrote:

> On Tue, Jun 18, 2024 at 12:32:33PM -0700, Dan Williams wrote:
> > It strikes me that playing these initcall games is a losing battle and
> > that this case would be best served by late loading of NPEM
> > functionality.
> > 
> > Something similar is happening with PCI device security where the
> > enabling depends on a third-party driver for a platform
> > "security-manager" (TSM) to arrive.
> > 
> > The approach there is to make the functionality independent of
> > device-discovery vs TSM driver load order. So, if the TSM driver is
> > loaded early then pci_init_capabilities() can immediately enable the
> > functionality. If the TSM driver is loaded *after* some devices have already
> > gone through pci_init_capabilities(), then that event is responsible for
> > doing for_each_pci_dev() to catch up on devices that missed their
> > initial chance to turn on device security details.
> > 
> > So, for NPEM, the thought would be to implement the same rendezvous
> > flow, i.e. s/TSM/NPEM/.  
> 
> A different viewpoint is that these issues are caused by the
> "division of labor" between OS kernel and platform firmware.
> 
> In the NPEM case, Dell servers require the OS to call firmware
> to change LEDs.  But before OS can do that, OS has to initialize
> a certain other interface with firmware.
> 
> In the TSM case, Intel TDX Connect or AMD SEV-TIO require OS to
> ask firmware to perform certain authentication steps with devices,
> wherefore OS has to provide another interface to facilitate
> communication with the device.
> 
> It's a complexity nightmare exacerbated by vendor-specific quirks.
> 
> Which is why I'm arguing that firmware functionality (e.g. TDX module)
> should be constrained to the absolute minimum and the OS should be
> in control of as much as possible.  That's the approach Apple has
> been following as it's the only way to achieve their close interplay
> between hardware and software without making things too complex.
> 
> It seems what's keeping this series from working on Dell servers is
> primarily that the driver wants to read out LED status on probe.
> So I've recommended to Mariusz off-list to do that lazily if possible,
> i.e. on first read of a LED's status.
> 
> Then if users do try to read or write LED status on Dell servers without
> loading IPMI modules first, they get to keep the pieces, sorry. :(

> 
Initially, I thought that Dan's suggestion was the best option, but after
taking into account the driver's use cases and the timings provided by Stuart,
lazy loading wins.
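
To make the lazy approach concrete, here is a minimal user-space sketch, not the
actual patch. The struct, the function names and the demo backends are all
hypothetical; the only idea taken from this thread is that the LED status is
read on first use instead of at probe time, and that a failed read is reported
to the caller and retried on the next access.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for per-device NPEM state; not the struct from the patch. */
struct npem_sketch {
	bool cached;			/* has the status been read from the backend yet? */
	uint32_t active_inds;		/* cached indication bits */
	int (*read_hw)(uint32_t *inds);	/* backend read; may fail while firmware is not ready */
};

/* Read the status lazily: touch the backend on first use, not at probe. */
static int npem_sketch_get_inds(struct npem_sketch *npem, uint32_t *inds)
{
	int ret;

	if (!npem->cached) {
		ret = npem->read_hw(&npem->active_inds);
		if (ret)
			return ret;	/* nothing cached, so a later call retries */
		npem->cached = true;
	}

	*inds = npem->active_inds;
	return 0;
}

/* Demo backends (assumptions for illustration only). */
static int demo_calls;

/* Models a platform whose firmware interface (e.g. IPMI) is not loaded yet. */
static int demo_backend_not_ready(uint32_t *inds)
{
	(void)inds;
	demo_calls++;
	return -ENODEV;
}

/* Models a working backend that reports two indications as active. */
static int demo_backend_ok(uint32_t *inds)
{
	demo_calls++;
	*inds = 0x5;
	return 0;
}
```

With this shape the LED interface can be registered right away. The user sees
an error only while the backend is missing, and the first successful read fills
the cache.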

As an LED application maintainer, I can accept the fact that I cannot set an
LED for a while and that errors will be reported; that is fine. I can leave a
hint for the user explaining why it is happening.

It would be a nightmare to have a new LED controller show up after some time
because the appearance of the LED interface is delayed. That is much worse from
the user's perspective, because no device means that I have no information in
userland. I cannot determine whether something is going to come up soon, so I
would report the disks as not supported, which is unnecessary maintenance hell.
I may receive a lot of issues.

Stuart, please give me some time to apply the suggestions and introduce the
lazy approach. I'm working on it!

Thanks,
Mariusz



