Sorry for the delay, I got a bit busy for a while there. Responses inline. On 3/25/24 10:54 AM, Dan Williams wrote: > Ben Cheatham wrote: >> Hey Dan, >> >> Sorry that I went radio silent on this, I've been thinking about the approach to take >> from here on out. More below! >> >> On 2/15/24 5:43 PM, Dan Williams wrote: >> >> [snip] >> >>>> The series also introduces a PCIe port bus driver dependency on the CXL >>>> core. I expect to be able to remove that when when my team submits >>>> patches for a future rework of the PCIe port bus driver. >>> >>> We have time to wait for that work to settle. Do not make the job harder >>> in the near term by adding one more dependency to unwind. >>> >> >> For sure, I'll withhold any work I do that touches the port bus driver >> until that's sent upstream. I'll send anything that doesn't touch >> the port driver, i.e. CXL region changes, as RFC as well. >> >> [snip] >> >>> So if someone says, "yes I can tolerate losing a root port at a time and >>> I can tolerate deploying my workloads with userspace memory management, >>> and this is preferable to a reboot", then maybe Linux should entertain >>> CXL Error Isolation. Until such an end use case gains clear uptake it >>> seems too early to worry about plumbing the notification mechanism. >> >> So in my mind the primary use case for this feature is in a >> server/data center environment. Losing a root port and the attached >> memory is definitely preferable to a reboot in this environment, so I >> think that isolation would still be useful even if it isn't >> plug-and-play. > > That is not sufficient. This feature needs an implementation partner to > work through the caveats. > Got it, I'll come back to this if/when I have a customer willing to commit to this. >> I agree with your assessment of what has to happen before we can make >> this feature work. From what I understand these are the main >> requirements for making isolation work: > > Requirement 1 is "Find customer". > >> 1. The memory region can't be onlined as System RAM >> 2. The region needs to be mapped as device-dax >> 3. There need to be safeguards such that someone can't remap the >> region to System RAM with error isolation enabled (added by me) > > No, this policy does not belong in the kernel. > Understood. >> Considering these all have to do with mm, I think that's a good place >> to start. What I'm thinking of starting with is adding a module >> parameter (or config option) to enable isolation for CXL dax regions >> by default, as well as a sysfs option for toggling error isolation for >> the CXL region. When the module parameter is specified, the CXL region >> driver would create the region as a device-dax region by default >> instead of onlining the region as system RAM. The sysfs would allow >> toggling error isolation for CXL regions that are already device-dax. > > No, definitely not going to invent module paramter policy for this. The > policy to not online dax regions is already available. > >> That would cover requirements 1 & 2, but would still allow someone to >> reconfigure the region as a system RAM region with error isolation >> still enabled using sysfs/daxctl. To make sure the this doesn't >> happen, my plan is to have the CXL region driver automatically disable >> error isolation when the underlying memory region is going offline, >> then check to make sure the memory isn't onlined as System RAM before >> re-enabling isolation. > > Why would you ever need to disable isolation? If it is present it at > least nominally allows software to keep running vs hang awaiting a > watchdog-timer reboot. > That's a good point. I think I was just looking at this too long and got a bit lost in the sauce, I'll keep this in mind for if/when I come back to this. >> So, with that in mind, isolation would most likely go from something >> that is enabled by default when compiled in to a feature for >> correctly-configured CXL regions that has to be enabled by the end >> user. If that is sounds good, here's what my roadmap would be going >> forward: > > Again, this needs a customer to weigh the mitigations that the kernel > needs to carry. > >> 1. Enable creating device-dax mapped CXL regions by default > > Already exists. > >> 2. Add support for checking a CXL region meets the requirements >> for enabling isolation (i.e. all devices in >> dax region(s) are device-dax) > > Regions should not know or care if isolation is enabled, the > implications are identical to force unbinding the region driver. > Got it. Thanks, Ben >> 3. Add back in the error isolation enablement and notification >> mechanisms in this patch series > > Not until it is clear that this feature has deployment value.