On Sat, May 14, 2022 at 6:31 AM Lukas Wunner <lukas@xxxxxxxxx> wrote:
>
> On Wed, May 11, 2022 at 12:42:24PM -0700, Dan Williams wrote:
> > I think power-management effects relative to IDE is a soft spot of the
> > specification.
>
> When resuming from system sleep, the kernel restores a device's
> config space in pci_pm_resume_noirq(), then calls the driver's
> ->resume_noirq() callback. The driver is free to assume that
> the device is accessible and usable at that point.
>
> IDE breaks that contract if establishment of an SPDM session
> depends on user space. We can't call out to user space for
> authentication during the resume_noirq phase because interrupts
> are still disabled.
>
> Drivers would have to be aware that IDE has not yet been
> re-established and refrain from accessing the device.
> Any child devices of the PCI device cannot be resumed
> until then.

Suspend has larger issues with CXL:

https://lore.kernel.org/linux-cxl/165066828317.3907920.5690432272182042556.stgit@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

...so IDE is just one more problem on top that requires disabling
suspend. Unless / until firmware takes responsibility for setting up
IDE I am not seeing a clean option for allowing the link to go down.

> Ideally we'd want IDE to be transparent to drivers.
> That's impossible if their access to devices is forbidden
> after system sleep for an indefinite amount of time.
>
> Runtime PM has similar issues as system sleep if the device
> was in D3cold.
>
> Reliance on user space also entails a risk of deadlocks:
> Let's say user space process A accesses a PCI device,
> the kernel runtime resumes the device and calls out to
> user space process B to authenticate it. If A is holding
> a resource that B requires, the two tasks deadlock and
> the device never becomes accessible.
>
> The more I think about it, the more attractive does Jonathan's
> in-kernel SPDM approach look. Performing SPDM authentication and
> IDE setup in the kernel would allow us to retain all existing
> assumptions and behavior around power management and reset recovery,
> avoid driver awareness of IDE and avoid deadlocks.

I agree with you that userspace coordination has these problems, but
they are secondary to the larger problem that hosting memory behind
PCI devices causes.

> > > If setting up an SPDM session is dependent on user space, the kernel
> > > would have to leave a device in an inoperable state after runtime resume
> > > or reset, until user space gets around to initiate SPDM.
> >
> > Yes, this seems acceptable from the perspective of server platforms
> > that can make the power management vs security tradeoff.
>
> It seems likely that IDE will not only be used on server platforms.

I expect IDE outside of the server space will need to be platform
firmware managed. OS managed IDE seems a stopgap to platform firmware
validating devices to be within the trusted compute boundary.

> I'll see to it that I provide more review feedback to Jonathan's RFC
> series so that we can move forward with this.
>
> Thanks,
>
> Lukas