Ben Cheatham wrote: > Hey Dan, > > Sorry that I went radio silent on this, I've been thinking about the approach to take > from here on out. More below! > > On 2/15/24 5:43 PM, Dan Williams wrote: > > [snip] > > >> The series also introduces a PCIe port bus driver dependency on the CXL > >> core. I expect to be able to remove that when when my team submits > >> patches for a future rework of the PCIe port bus driver. > > > > We have time to wait for that work to settle. Do not make the job harder > > in the near term by adding one more dependency to unwind. > > > > For sure, I'll withhold any work I do that touches the port bus driver > until that's sent upstream. I'll send anything that doesn't touch > the port driver, i.e. CXL region changes, as RFC as well. > > [snip] > > > So if someone says, "yes I can tolerate losing a root port at a time and > > I can tolerate deploying my workloads with userspace memory management, > > and this is preferable to a reboot", then maybe Linux should entertain > > CXL Error Isolation. Until such an end use case gains clear uptake it > > seems too early to worry about plumbing the notification mechanism. > > So in my mind the primary use case for this feature is in a > server/data center environment. Losing a root port and the attached > memory is definitely preferable to a reboot in this environment, so I > think that isolation would still be useful even if it isn't > plug-and-play. That is not sufficient. This feature needs an implementation partner to work through the caveats. > I agree with your assessment of what has to happen before we can make > this feature work. From what I understand these are the main > requirements for making isolation work: Requirement 1 is "Find customer". > 1. The memory region can't be onlined as System RAM > 2. The region needs to be mapped as device-dax > 3. There need to be safeguards such that someone can't remap the > region to System RAM with error isolation enabled (added by me) No, this policy does not belong in the kernel. > Considering these all have to do with mm, I think that's a good place > to start. What I'm thinking of starting with is adding a module > parameter (or config option) to enable isolation for CXL dax regions > by default, as well as a sysfs option for toggling error isolation for > the CXL region. When the module parameter is specified, the CXL region > driver would create the region as a device-dax region by default > instead of onlining the region as system RAM. The sysfs would allow > toggling error isolation for CXL regions that are already device-dax. No, definitely not going to invent module paramter policy for this. The policy to not online dax regions is already available. > That would cover requirements 1 & 2, but would still allow someone to > reconfigure the region as a system RAM region with error isolation > still enabled using sysfs/daxctl. To make sure the this doesn't > happen, my plan is to have the CXL region driver automatically disable > error isolation when the underlying memory region is going offline, > then check to make sure the memory isn't onlined as System RAM before > re-enabling isolation. Why would you ever need to disable isolation? If it is present it at least nominally allows software to keep running vs hang awaiting a watchdog-timer reboot. > So, with that in mind, isolation would most likely go from something > that is enabled by default when compiled in to a feature for > correctly-configured CXL regions that has to be enabled by the end > user. If that is sounds good, here's what my roadmap would be going > forward: Again, this needs a customer to weigh the mitigations that the kernel needs to carry. > 1. Enable creating device-dax mapped CXL regions by default Already exists. > 2. Add support for checking a CXL region meets the requirements > for enabling isolation (i.e. all devices in > dax region(s) are device-dax) Regions should not know or care if isolation is enabled, the implications are identical to force unbinding the region driver. > 3. Add back in the error isolation enablement and notification > mechanisms in this patch series Not until it is clear that this feature has deployment value.