On Thu, Feb 16, 2023 at 3:36 AM Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
>
> While experimenting with CXL region removal the following corruption of
> /proc/iomem appeared.
>
> Before:
> f010000000-f04fffffff : CXL Window 0
>   f010000000-f02fffffff : region4
>     f010000000-f02fffffff : dax4.0
>       f010000000-f02fffffff : System RAM (kmem)
>
> After (modprobe -r cxl_test):
> f010000000-f02fffffff : **redacted binary garbage**
>   f010000000-f02fffffff : System RAM (kmem)
>
> ...and testing further the same is visible with persistent memory
> assigned to kmem:
>
> Before:
> 480000000-243fffffff : Persistent Memory
>   480000000-57e1fffff : namespace3.0
>   580000000-243fffffff : dax3.0
>     580000000-243fffffff : System RAM (kmem)
>
> After (ndctl disable-region all):
> 480000000-243fffffff : Persistent Memory
>   580000000-243fffffff : ***redacted binary garbage***
>     580000000-243fffffff : System RAM (kmem)
>
> The corrupted data is from a use-after-free of the "dax4.0" and "dax3.0"
> resources, and it also shows that the "System RAM (kmem)" resource is
> not being removed. The bug does not appear after "modprobe -r kmem", it
> requires the parent of "dax4.0" and "dax3.0" to be removed which
> re-parents the leaked "System RAM (kmem)" instances. Those in turn
> reference the freed resource as a parent.
>
> First up for the fix is release_mem_region_adjustable() needs to
> reliably delete the resource inserted by add_memory_driver_managed().
> That is thwarted by a check for IORESOURCE_SYSRAM that predates the
> dax/kmem driver, from commit:
>
> 65c78784135f ("kernel, resource: check for IORESOURCE_SYSRAM in release_mem_region_adjustable")
>
> That appears to be working around the behavior of HMM's
> "MEMORY_DEVICE_PUBLIC" facility that has since been deleted. With that
> check removed the "System RAM (kmem)" resource gets removed, but
> corruption still occurs occasionally because the "dax" resource is not
> reliably removed.
>
> The dax range information is freed before the device is unregistered, so
> the driver can not reliably recall (another use after free) what it is
> meant to release. Lastly if that use after free got lucky, the driver
> was covering up the leak of "System RAM (kmem)" due to its use of
> release_resource() which detaches, but does not free, child resources.
> The switch to remove_resource() forces remove_memory() to be responsible
> for the deletion of the resource added by add_memory_driver_managed().
>
> Fixes: c2f3011ee697 ("device-dax: add an allocation interface for device-dax instances")
> Cc: <stable@xxxxxxxxxxxxxxx>
> Cc: Oscar Salvador <osalvador@xxxxxxx>
> Cc: David Hildenbrand <david@xxxxxxxxxx>
> Cc: Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>
> Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>

Reviewed-by: Pasha Tatashin <pasha.tatashin@xxxxxxxxxx>

Thanks,
Pasha
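
As a rough illustration of the release_resource() versus remove_resource() distinction the patch relies on, here is a userspace toy model of a resource tree. It is a sketch under stated assumptions, not kernel code: struct toy_res, toy_insert() and toy_unlink() are hypothetical names invented for this example, and the unlink logic only approximates what kernel/resource.c does. With keep_children set, the children stay attached to the detached node, so they end up pointing at a parent that the caller may free next. With keep_children clear, the children are moved up to the grandparent and remain visible in the tree.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the kernel's struct resource; NOT the kernel definition. */
struct toy_res {
	const char *name;
	struct toy_res *parent, *sibling, *child;
};

/* Link @new as the last child of @parent (hypothetical helper). */
static void toy_insert(struct toy_res *parent, struct toy_res *new)
{
	struct toy_res **p = &parent->child;

	while (*p)
		p = &(*p)->sibling;
	*p = new;
	new->parent = parent;
	new->sibling = NULL;
}

/*
 * Unlink @old from its parent (hypothetical helper).
 *
 * keep_children != 0: children stay attached to @old, loosely modeling
 *                     release_resource(). If @old is then freed, each
 *                     child holds a dangling ->parent pointer.
 * keep_children == 0: children are re-parented to @old's parent, loosely
 *                     modeling remove_resource().
 *
 * Returns 0 on success, -1 if @old is not linked into a tree.
 */
static int toy_unlink(struct toy_res *old, int keep_children)
{
	struct toy_res **p, *chd;

	if (!old->parent)
		return -1;

	for (p = &old->parent->child; *p; p = &(*p)->sibling) {
		if (*p != old)
			continue;

		if (keep_children || !old->child) {
			/* Drop @old (and whatever hangs off it) from the list. */
			*p = old->sibling;
		} else {
			/* Point every child at the grandparent. */
			for (chd = old->child; ; chd = chd->sibling) {
				chd->parent = old->parent;
				if (!chd->sibling)
					break;
			}
			/* Splice the child list into @old's former slot. */
			*p = old->child;
			chd->sibling = old->sibling;
			old->child = NULL;
		}
		old->parent = NULL;
		old->sibling = NULL;
		return 0;
	}
	return -1;
}
```

In the toy model, unlinking "dax4.0" with keep_children set leaves "System RAM (kmem)" with a ->parent that still points at "dax4.0", which mirrors the leak described in the commit message. Unlinking with keep_children clear leaves "System RAM (kmem)" under "region4", where a later removal step can still find and delete it.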