Alison Schofield wrote: [..] > > Hi Nathan, this patch hit on all the mechanisms I would expect, but upon > > reading it there is an opportunity to zoom out and do something blunter > > than the surgical precision of this current proposal. > > > > In other words, I appreciate the consideration of potential corner > > cases, but for overall maintainability this should aim to be an all or > > nothing approach. > > > > Specifically, at the first sign of trouble, any CXL sub-driver probe > > failure or region enumeration timeout, that the entire CXL topology be > > torn down (trigger the equivalent of ->remove() on the ACPI0017 device), > > and the deferred Soft Reserved ranges registered as if cxl_acpi was not > > present (implement a fallback equivalent to hmem_register_devices()). > > > > No need to trim resources as regions arrive, just tear down everything > > setup in the cxl_acpi_probe() path with devres_release_all(). > > > > So, I am thinking export a flag from the CXL core that indicates whether > > any conflict with platform-firmware established CXL regions has > > occurred. > > > > Read that flag from an cxl_acpi-driver-launched deferred workqueue that > > is awaiting initial device probing to quiesce. If that flag indicates a > > CXL enumeration failure then trigger devres_release_all() on the > > ACPI0017 platform device and follow that up by walking the deferred Soft > > Reserve resources to register raw (unparented by CXL regions) dax > > devices. > > > > This reads like a 'poison pill' case that is in addition to the original > use cases that inspired this patch. I'm not sure I'm getting the 'all or > nothing' language. Read through this recent regression thread: http://lore.kernel.org/20241202111941.2636613-1-raghavendra.kt@xxxxxxx That is evidence that the ways in which platform CXL configurations can confuse the CXL subsystem are myriad and seemingly growing. So this change of heart from me is realizing that the solution needed here is less about how to recover the Soft Reserved ranges after CXL successfully takes over, it is about abandoning the notion that the driver should continue to best-effort fumble along after discovering that a fundamental assumption has been violated. For example this would also mitigate the "Extended Linear Cache" enumeration problem: http://lore.kernel.org/20241204224827.2097263-1-dave.jiang@xxxxxxxxx ...where the driver fools itself into thinking it knows what is happening just based on reading hardware decoders, but not realizing that DDR aliases into the same physical address range. Yes, you lose important CXL subsystem services like address translation, but it is arguably worse to confidently offer a bad translation based on wrong assumption than offer no translation with an undecorated dax device, and in the meantime the memory is usable. The driver should stop guessing and say, "there is reason to believe I have no idea what this platform is actually doing. The platform firmware seems to think it knows what is coherent / enabled memory range. In the interests of not dropping precious resources on the floor, or proceeding with potential misconfiguration of the memory map, just revert to 100% memory-mapped enumerated resources without CXL decoration." > The patch was doing Case 1) and 2). This poison-pill approach is 3) So it is not "poison-pill" or "sour grapes", it is "irresponsible to continue when fundamental assumptions are violated, and wasteful to leave the system with reduced memory capacity when platform firmware put it in the memory map". The "unwind SR on CXL region destruction" use case comes along for the ride as a bonus feature. > > 1) SR aligns exactly with a CXL Window. > This patch removes that SR so the address space is available for reuse if > an auto-region is torn down. > > 2) SR is larger than a CXL Window. > Where the SR aligns with the CXL Window the SR is handled as in 1). > The SR leftovers are released to DAX. > > 3) Failure in auto-region assembly. > New case: tear down ACPI0017 and release all SRs. release all SRs back to default enumeration behavior. > So, after writing the above, maybe I get what the 'all' case is. > Are you suggesting we stop trimming and ignore leftovers and just > consider both 1) and 2) the 'all' case? No SR ever makes it into > the iomem resource tree so all resources are available for reuse > after teardown. And then there is 3), the nothing case! I get that. > > It will waste some leftovers that could have been handed to DAX. > I know the SR greater than CXL Window came in handy for hotplug when > the SRs were not released, but I never understood if SRs greater > than CXL Window intended to serve any purpose. ...and that is another example to prove the point. If the CXL subsystem can not definitively say why it is trimming too large SR resources based on the CXL Window size it should default to giving the end user all the memory promised in the EFI memory map.