On Sun, Feb 05, 2023 at 05:02:29PM -0800, Dan Williams wrote:
> Summary:
> --------
>
> CXL RAM support allows for the dynamic provisioning of new CXL RAM
> regions, and more routinely, assembling a region from an existing
> configuration established by platform-firmware. The latter is motivated
> by CXL memory RAS (Reliability, Availability and Serviceability)
> support, that requires associating device events with System Physical
> Address ranges and vice versa.
>
> The 'Soft Reserved' policy rework arranges for performance
> differentiated memory like CXL attached DRAM, or high-bandwidth memory,
> to be designated for 'System RAM' by default, rather than the device-dax
> dedicated access mode. That current device-dax default is confusing and
> surprising for the Pareto of users that do not expect memory to be
> quarantined for dedicated access by default. Most users expect all
> 'System RAM'-capable memory to show up in FREE(1).
>
>
> Details:
> --------
>
> Recall that the Linux 'Soft Reserved' designation for memory is a
> reaction to platform-firmware, like EFI EDK2, delineating memory with
> the EFI Specific Purpose Memory attribute (EFI_MEMORY_SP). An
> alternative way to think of that attribute is that it specifies the
> *not* general-purpose memory pool. It is memory that may be too precious
> for general usage or not performant enough for some hot data structures.
> However, in the absence of explicit policy it should just be 'System
> RAM' by default.
>
> Rather than require every distribution to ship a udev policy to assign
> dax devices to dax_kmem (the device-memory hotplug driver) just make
> that the kernel default.
> This is similar to the rationale in:
>
>   commit 8604d9e534a3 ("memory_hotplug: introduce CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE")
>
> With this change the relatively niche use case of accessing this memory
> via mapping a device-dax instance can be achieved by building with
> CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE=n, or specifying
> memhp_default_state=offline at boot, and then use:
>
>     daxctl reconfigure-device $device -m devdax --force
>
> ...to shift the corresponding address range to device-dax access.
>
> The process of assembling a device-dax instance for a given CXL region
> device configuration is similar to the process of assembling a
> Device-Mapper or MDRAID storage-device array. Specifically, asynchronous
> probing by the PCI and driver core enumerates all CXL endpoints and
> their decoders. Then, once enough decoders have arrived to describe a
> given region, that region is passed to the device-dax subsystem where it
> is subject to the above 'dax_kmem' policy. This assignment and policy
> choice is only possible if memory is set aside by the 'Soft Reserved'
> designation. Otherwise, CXL that is mapped as 'System RAM' becomes
> immutable by CXL driver mechanisms, but is still enumerated for RAS
> purposes.
>
> This series is also available via:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.3/cxl-ram-region
>
> ...and has gone through some preview testing in various forms.
>
> ---

Tested-by: Fan Ni <fan.ni@xxxxxxxxxxx>

Ran the following tests with the patch set (with volatile memory support in
QEMU). Note: the CXL-related code is compiled as modules and loaded before
use.

For the pmem setup, tried three topologies (1HB1RP1Mem, 1HB2RP2Mem, and
1HB2RP4Mem with a CXL switch). The memdev is either provided on the command
line when launching QEMU or hot-added to the guest with the device_add
command in the QEMU monitor.

The following operations were performed:
1. Create a region with the cxl command
2. Create a namespace with the ndctl command
3. Convert the CXL memory to RAM with the daxctl command
4. Online the memory with the daxctl command
5. Let an application use the memory (numactl --membind=1 htop)

Results: No regression.

For volatile memory (hot-added with the device_add command), mainly tested
the 1HB1RP1Mem case (passthrough):
1. The device is correctly discovered after hot add (cxl list; may need
   cxl enable-memdev).
2. Creating a RAM region (cxl create-region) succeeded; after the region is
   created, a dax device appears under /dev/.
3. Onlining the memory passed, and the memory shows up on another NUMA node.
4. Letting an application use the memory (numactl --membind=1 htop) passed.

>
> Dan Williams (18):
>       cxl/Documentation: Update references to attributes added in v6.0
>       cxl/region: Add a mode attribute for regions
>       cxl/region: Support empty uuids for non-pmem regions
>       cxl/region: Validate region mode vs decoder mode
>       cxl/region: Add volatile region creation support
>       cxl/region: Refactor attach_target() for autodiscovery
>       cxl/region: Move region-position validation to a helper
>       kernel/range: Uplevel the cxl subsystem's range_contains() helper
>       cxl/region: Enable CONFIG_CXL_REGION to be toggled
>       cxl/region: Fix passthrough-decoder detection
>       cxl/region: Add region autodiscovery
>       tools/testing/cxl: Define a fixed volatile configuration to parse
>       dax/hmem: Move HMAT and Soft reservation probe initcall level
>       dax/hmem: Drop unnecessary dax_hmem_remove()
>       dax/hmem: Convey the dax range via memregion_info()
>       dax/hmem: Move hmem device registration to dax_hmem.ko
>       dax: Assign RAM regions to memory-hotplug by default
>       cxl/dax: Create dax devices for CXL RAM regions
>
>
>  Documentation/ABI/testing/sysfs-bus-cxl |   64 +-
>  MAINTAINERS                             |    1
>  drivers/acpi/numa/hmat.c                |    4
>  drivers/cxl/Kconfig                     |   12
>  drivers/cxl/acpi.c                      |    3
>  drivers/cxl/core/core.h                 |    7
>  drivers/cxl/core/hdm.c                  |    8
>  drivers/cxl/core/pci.c                  |    5
>  drivers/cxl/core/port.c                 |   34 +
>  drivers/cxl/core/region.c               |  848 ++++++++++++++++++++++++++++---
>  drivers/cxl/cxl.h                       |   46 ++
>  drivers/cxl/cxlmem.h                    |    3
>  drivers/cxl/port.c                      |   26 +
>  drivers/dax/Kconfig                     |   17 +
>  drivers/dax/Makefile                    |    2
>  drivers/dax/bus.c                       |   53 +-
>  drivers/dax/bus.h                       |   12
>  drivers/dax/cxl.c                       |   53 ++
>  drivers/dax/device.c                    |    3
>  drivers/dax/hmem/Makefile               |    3
>  drivers/dax/hmem/device.c               |  102 ++--
>  drivers/dax/hmem/hmem.c                 |  148 +++++
>  drivers/dax/kmem.c                      |    1
>  include/linux/dax.h                     |    7
>  include/linux/memregion.h               |    2
>  include/linux/range.h                   |    5
>  lib/stackinit_kunit.c                   |    6
>  tools/testing/cxl/test/cxl.c            |  146 +++++
>  28 files changed, 1355 insertions(+), 266 deletions(-)
>  create mode 100644 drivers/dax/cxl.c
>
> base-commit: 172738bbccdb4ef76bdd72fc72a315c741c39161
>
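
For anyone who wants to repeat the volatile-memory test, the steps listed in
the test report can be sketched roughly as the script below. This is only an
illustration, not the exact commands that were run: the device names (mem0,
decoder0.0, dax0.0) are placeholders that must be taken from "cxl list" on
the actual guest, and the run() wrapper and DRY_RUN variable are hypothetical
helpers. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of the volatile-memory test steps; defaults to a dry run.
# MEMDEV, DECODER, and DAXDEV are placeholder names (assumptions), not
# values from the test report; adjust them to match "cxl list" output.
MEMDEV=${MEMDEV:-mem0}
DECODER=${DECODER:-decoder0.0}
DAXDEV=${DAXDEV:-dax0.0}
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

cxl_ram_test() {
    # 1. Check that the hot-added memdev is visible; enable it if needed.
    run cxl list -M
    run cxl enable-memdev "$MEMDEV"
    # 2. Create a ram region; -m marks the trailing arguments as memdevs.
    run cxl create-region -m -d "$DECODER" -t ram "$MEMDEV"
    # 3. Online the memory behind the resulting dax device.
    run daxctl online-memory "$DAXDEV"
    # 4. Let an application allocate from the new NUMA node.
    run numactl --membind=1 htop
}

cxl_ram_test
```

Set DRY_RUN=0 to actually execute the commands. The NUMA node number passed
to numactl is the one from the test report and may differ on other setups.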