On Mon, Apr 4, 2022 at 8:16 PM Dan Williams <dan.j.williams@xxxxxxxxx> wrote: > > On Mon, Apr 4, 2022 at 9:00 AM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote: > > > > On Sun, Apr 3, 2022 at 2:58 AM Dan Williams <dan.j.williams@xxxxxxxxx> wrote: > > > > > > The CXL specification claims S3 support at a hardware level, but at a > > > system software level there are some missing pieces. Section 9.4 rightly > > > claims that "CXL mem adapters may need aux power to retain memory > > > context across S3", but there is no enumeration mechanism for the OS to > > > determine if a given adapter has that support. Moreover the save state > > > and resume image for the system may inadvertantly end up in a CXL device > > > that needs to be restored before the save state is recoverable. I.e. a > > > circular dependency that is not resolvable without a third party > > > save-area. > > > > > > Arrange for the cxl_mem driver to fail S3 attempts. This still nominaly > > > allows for suspend, but requires unbinding all CXL memory devices before > > > the suspend to ensure the typical DRAM flow is taken. The cxl_mem unbind > > > flow is intended to also tear down all CXL memory regions associated > > > with a given cxl_memdev. > > > > > > It is reasonable to assume that any device participating in a System RAM > > > range published in the EFI memory map is covered by aux power and > > > save-area outside the device itself. So this restriction can be > > > minimized in the future once pre-existing region enumeration support > > > arrives, and perhaps a spec update to clarify if the EFI memory is > > > sufficent for determining the range of devices managed by > > > platform-firmware for S3 support. > > > > > > Cc: "Rafael J. Wysocki" <rafael@xxxxxxxxxx> > > > Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx> > > > > A few thoughts: > > > > 1. I don't think it is necessary to fail suspend-to-idle too (which > > the driver will do after the patch AFAICS). > > Ah true, I missed that this would also disable suspend to idle. > > > 2. Should hibernation fail too? From the description above it looks > > like that should be the case. > > Yes, any CXL address range that was provisioned by the OS would need > some off-device save area for the device-state which seems difficult > to support in the general case. So it should cause errors to be returned from the hibernation path too, ideally before the freezing of tasks. > > 3. If "deep"suspend is going to fail every time, it may be better to > > prevent "deep" from being written to /sys/power/mem_sleep instead of > > failing suspend in progress, especially after freezing user space. > > Yeah, that sounds much better, let me explore that option.