On Wed, May 11, 2022 at 1:22 PM Hindman, Gavin <gavin.hindman@xxxxxxxxx> wrote:
>
> >-----Original Message-----
> >From: Dan Williams <dan.j.williams@xxxxxxxxx>
> >Sent: Wednesday, May 11, 2022 12:42 PM
> >To: Lukas Wunner <lukas@xxxxxxxxx>
> >Cc: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>; Hindman, Gavin
> ><gavin.hindman@xxxxxxxxx>; Linuxarm <linuxarm@xxxxxxxxxx>; Weiny, Ira
> ><ira.weiny@xxxxxxxxx>; Linux PCI <linux-pci@xxxxxxxxxxxxxxx>;
> >linux-cxl@xxxxxxxxxxxxxxx; CHUCK_LEVER <chuck.lever@xxxxxxxxxx>
> >Subject: Re: [RFC PATCH 0/1] DOE usage with pcie/portdrv
> >
> >On Wed, May 11, 2022 at 12:14 PM Lukas Wunner <lukas@xxxxxxxxx> wrote:
> >>
> >> On Mon, May 09, 2022 at 10:48:06AM +0100, Jonathan Cameron wrote:
> >> > On Sat, 7 May 2022 12:18:48 +0200 Lukas Wunner <lukas@xxxxxxxxx> wrote:
> >> > > I'm still somewhat undecided on the kernel vs. user space question.
> >> >
> >> > Likewise. I feel a few more prototypes are needed to come to clear
> >> > conclusion.
> >>
> >> Gavin Hindman (+cc) raised an important point off-list:
> >>
> >> When an IDE-capable device is runtime suspended to D3hot and later
> >> runtime resumed to D0, it may not preserve its internal state.
> >> (The No_Soft_Reset bit in the Power Management Control/Status Register
> >> tells us whether the device is capable of preserving internal state
> >> over a transition to D3hot, see PCIe r6.0, sec. 7.5.2.2.)
> >
> >I think power-management effects relative to IDE is a soft spot of the
> >specification. If the link goes down then yes, IDE needs to be re-established,
> >but as far as I can see that's a policy tradeoff to support runtime reset or
> >support link encryption.
> >
> >> Likewise, when an IDE-capable device is reset (e.g. due to Downstream
> >> Port Containment, AER or a bus reset initiated by user space),
> >> internal state is lost and must be reconstructed by pci_restore_state().
> >> That state includes the SPDM session or IDE encryption.
> >>
> >> If setting up an SPDM session is dependent on user space, the kernel
> >> would have to leave a device in an inoperable state after runtime
> >> resume or reset, until user space gets around to initiate SPDM.
> >
> >Yes, this seems acceptable from the perspective of server platforms that can
> >make the power management vs security tradeoff.
> >
>
> Agree, though more and more we need to be thinking about sustainability
> and cost-of-ownership and having to keep devices awake in order to meet
> security goals is somewhat contrary to that objective. I fully realize
> those are not technical constraints, but IMO should still be considered.
> Latency for deadline-driven tasks was my original consideration, not
> just security - power-management features commonly get turned off due to
> resume latency, and this would appear to have the potential to extend
> resume latencies even in kernel, let alone waiting for user-space
> response. Again, obviously not a hard design constraint, but seems
> worthy of consideration

Keep in mind that kernel-managed IDE is not much more than a stop-gap
on the way to fully attesting that devices are within a given trusted
compute boundary. In that model the kernel is not trusted to establish
that validation. Instead, that role is reserved for a trusted platform
entity. So yes, those are important considerations, but they do not
read on the kernel implementation in the near term.
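
For readers following the No_Soft_Reset point quoted above, here is a minimal, self-contained sketch of the check in plain C. It is not code from this patch set. The bit value mirrors PCI_PM_CTRL_NO_SOFT_RESET from Linux's include/uapi/linux/pci_regs.h. The helper names state_survives_d3hot() and needs_session_reestablish() are hypothetical, and the second one encodes only the simplified policy discussed in this thread (a reset, or a D3hot transition on a device without No_Soft_Reset, loses the SPDM/IDE state). It deliberately ignores the link-down case.

```c
#include <stdbool.h>
#include <stdint.h>

/* No_Soft_Reset bit in the Power Management Control/Status Register.
 * Same value as PCI_PM_CTRL_NO_SOFT_RESET in Linux's pci_regs.h. */
#define PM_CTRL_NO_SOFT_RESET 0x0008

/* Hypothetical helper: true if the device advertises that it keeps its
 * internal state across a D3hot -> D0 transition. */
static bool state_survives_d3hot(uint16_t pmcsr)
{
    return (pmcsr & PM_CTRL_NO_SOFT_RESET) != 0;
}

/* Hypothetical policy helper (an assumption for illustration only):
 * the SPDM session / IDE setup must be redone after any reset, or after
 * a runtime resume from D3hot when No_Soft_Reset is clear. */
static bool needs_session_reestablish(uint16_t pmcsr, bool was_reset)
{
    return was_reset || !state_survives_d3hot(pmcsr);
}
```

In a real driver the pmcsr value would come from reading the PM capability's control register. Here it is simply passed in, so the policy can be reasoned about in isolation.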