Fan Ni wrote:
> On Sun, Feb 05, 2023 at 05:02:29PM -0800, Dan Williams wrote:
> >
> > Summary:
> > --------
> >
> > CXL RAM support allows for the dynamic provisioning of new CXL RAM
> > regions, and more routinely, assembling a region from an existing
> > configuration established by platform-firmware. The latter is motivated
> > by CXL memory RAS (Reliability, Availability and Serviceability)
> > support, that requires associating device events with System Physical
> > Address ranges and vice versa.
> >
> > The 'Soft Reserved' policy rework arranges for performance
> > differentiated memory like CXL attached DRAM, or high-bandwidth memory,
> > to be designated for 'System RAM' by default, rather than the device-dax
> > dedicated access mode. That current device-dax default is confusing and
> > surprising for the Pareto of users that do not expect memory to be
> > quarantined for dedicated access by default. Most users expect all
> > 'System RAM'-capable memory to show up in FREE(1).
> >
> >
> > Details:
> > --------
> >
> > Recall that the Linux 'Soft Reserved' designation for memory is a
> > reaction to platform-firmware, like EFI EDK2, delineating memory with
> > the EFI Specific Purpose Memory attribute (EFI_MEMORY_SP). An
> > alternative way to think of that attribute is that it specifies the
> > *not* general-purpose memory pool. It is memory that may be too precious
> > for general usage or not performant enough for some hot data structures.
> > However, in the absence of explicit policy it should just be 'System
> > RAM' by default.
> >
> > Rather than require every distribution to ship a udev policy to assign
> > dax devices to dax_kmem (the device-memory hotplug driver) just make
> > that the kernel default.
> > This is similar to the rationale in:
> >
> > commit 8604d9e534a3 ("memory_hotplug: introduce CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE")
> >
> > With this change the relatively niche use case of accessing this memory
> > via mapping a device-dax instance can be achieved by building with
> > CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=n, or specifying
> > memhp_default_state=offline at boot, and then use:
> >
> >     daxctl reconfigure-device $device -m devdax --force
> >
> > ...to shift the corresponding address range to device-dax access.
> >
> > The process of assembling a device-dax instance for a given CXL region
> > device configuration is similar to the process of assembling a
> > Device-Mapper or MDRAID storage-device array. Specifically, asynchronous
> > probing by the PCI and driver core enumerates all CXL endpoints and
> > their decoders. Then, once enough decoders have arrived to describe a
> > given region, that region is passed to the device-dax subsystem where it
> > is subject to the above 'dax_kmem' policy. This assignment and policy
> > choice is only possible if memory is set aside by the 'Soft Reserved'
> > designation. Otherwise, CXL that is mapped as 'System RAM' becomes
> > immutable by CXL driver mechanisms, but is still enumerated for RAS
> > purposes.
> >
> > This series is also available via:
> >
> >     https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.3/cxl-ram-region
> >
> > ...and has gone through some preview testing in various forms.
> >
> > ---
> >
> Tested-by: Fan Ni <fan.ni@xxxxxxxxxxx>
>
>
> Run the following tests with the patch (with the volatile support at qemu).
> Note: cxl related code are compiled as modules and loaded before used.
>
> For pmem setup, tried three topologies (1HB1RP1Mem, 1HB2RP2Mem, 1HB2RP4Mem with
> a cxl switch). The memdev is either provided in the command line when launching
> qemu or hot added to the guest with device_add command in qemu monitor.
>
> The following operations are performed,
> 1. create-region with cxl cmd
> 2. create name-space with ndctl cmd
> 3. convert cxl mem to ram with daxctl cmd
> 4. online the memory with daxctl cmd
> 5. Let app use the memory (numactl --membind=1 htop)
>
> Results: No regression.
>
> For volatile memory (hot add with device_add command), mainly tested 1HB1RP1Mem
> case (passthrough).
> 1. the device can be correctly discovered after hot add (cxl list, may need
>    cxl enable-memdev)
> 2. creating ram region (cxl create-region) succeeded, after creating the
>    region, a dax device under /dev/ is shown.
> 3. online the memory passes, and the memory is shown on another NUMA node.
> 4. Let app use the memory (numactl --membind=1 htop) passed.

Thank you, Fan!
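
For anyone repeating the "memory is shown on another NUMA node" check
from the volatile test above, here is a minimal sketch of one way to
script it. It is not part of the series. It assumes the guest exposes
the standard has_memory and has_cpu node lists under
/sys/devices/system/node, and the helper names (parse_node_list,
memory_only_nodes, read_memory_only_nodes) are made up for
illustration. A node that has memory but no CPUs is a plausible
candidate for the onlined CXL region, though other CPU-less nodes
would match as well.

```python
from pathlib import Path

# Assumption: the standard sysfs location for NUMA node state.
NODE_DIR = Path("/sys/devices/system/node")


def parse_node_list(text):
    """Parse a kernel list string such as "0-1,3" into a set of ints."""
    nodes = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file means an empty list
        if "-" in part:
            lo, hi = part.split("-", 1)
            nodes.update(range(int(lo), int(hi) + 1))
        else:
            nodes.add(int(part))
    return nodes


def memory_only_nodes(has_memory, has_cpu):
    """Return the sorted node ids that have memory but no CPUs."""
    return sorted(parse_node_list(has_memory) - parse_node_list(has_cpu))


def read_memory_only_nodes(node_dir=NODE_DIR):
    """Same as memory_only_nodes(), reading the lists from sysfs."""
    return memory_only_nodes(
        (node_dir / "has_memory").read_text(),
        (node_dir / "has_cpu").read_text(),
    )
```

Calling read_memory_only_nodes() before and after onlining the memory
and comparing the two results should show the new node id, which can
then be passed to numactl --membind as in step 4.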