On Wed, Feb 26, 2025 at 09:21:17AM -0700, Dave Jiang wrote:

With the small fixup needed in Patch 3/4, you can add my tag for the series:

Reviewed-by: Alison Schofield <alison.schofield@xxxxxxxxx>

> v5:
> - Update couple dev_dbg() emits. (Alison)
> - Add hpa_alias emits for poison events. (Alison)
> - Drop cxlr_hpa_cache_alias() and opencode the one invocation. (Alison)
> - See individual patches for detailed changes.
>
> v4:
> - Add alias adjustment for cxl_dpa_to_hpa() (Alison)
> - Add check of adjusted region start against CFMWS (Alison)
> - Use ULLONG_MAX consistently. (Alison)
> - Use hpa_alias0 consistently. (Alison)
> - Move devm_add_action_or_reset() to devm_cxl_add_mce_notifier(). (Ming)
> - See individual patches for detailed changes.
>
> v3:
> - Drop region to nid function, deadcode.
> - Set hpa_alias default to ~0ULL to indicate no alias. (Jonathan)
> - Add endpoint check for mce handler. (Ming)
> - Add mce notifier unregister. (Ming)
>
> v2:
> - Fix 0-day issues
> - Fix checking of cache flag. (Ming)
> - Add comment about cache range vs CFMWS. (Ming)
> - Update EXPORT_SYMBOL_(). (Jonathan)
> - Fix various code comments. (Jonathan)
> - Emit hpa_alias0 instead of hpa_alias. (Jonathan)
> - Introduce CONFIG_CXL_MCE to address kernel build dep issues.
>
> v1:
> - Drop RFC prefix
> - Drop MMIO hole discovery. Will implement if there's real world implementation.
> - Drop MCE_PRI_CXL. Use MCE_PRI_UC. (Boris)
> - Minor refactors and grammar fixes. (Jonathan)
> - Rename 'mode' to 'address_mode'. (Jonathan)
>
> RFCv2:
> - Dropped 1/6 (ACPICA definition merged)
> - Change UNKNOWN to RESERVED for cache definition. (Jonathan)
> - Fix spelling errors (Jonathan)
> - Rename region_res_match_range() to region_res_match_cxl_range(). (Jonathan)
> - Add warning when cache is not 1:1 with backing region. (Jonathan)
> - Code and comments cleanup. (Jonathan)
> - Make MCE code access in CXL arch independent. (Jonathan)
> - Fixup 0-day reports.
>
> Certain systems provide an exclusive caching memory configuration where a
> 1:1 layout of DRAM and far memory (FM) such as CXL memory is utilized. In
> this configuration, the memory region is provided as a single memory region
> to the OS. For example:
>
>              128GB DRAM                          128GB CXL memory
> |------------------------------------|------------------------------------|
>
> The kernel sees the region as a 256G system memory region. Data can reside
> in either DRAM or FM with no replication. Hot data is swapped into DRAM by
> the hardware behind the scenes.
>
> This kernel series introduces code to enumerate the side cache by the kernel
> when configured in an exclusive-cache configuration. It also adds RAS support
> to deal with the aliased memory addresses.
>
> A new ECN [1] to the ACPI HMAT table was introduced and approved to describe
> the "extended-linear" addressing for direct-mapped memory-side caches. A
> reserved field in the Memory Side Cache Information Structure of HMAT is
> redefined as "Address Mode" where a value of 1 is defined as Extended-linear
> mode. This value is valid if the cache is direct mapped. "It indicates that
> the associated address range (SRAT.MemoryAffinityStructure.Length) is
> comprised of the backing store capacity extended by the cache capacity." By
> augmenting the HMAT and SRAT parsing code, this new information can be stored
> by the HMAT handling code.
>
> Current CXL region enumeration code is not enlightened with the side cache
> configuration and therefore only presents the region size as the size of the
> CXL region. Add support to allow CXL region enumeration code to query the HMAT
> handling code and retrieve the information regarding the side cache and adjust
> the region size accordingly. This should allow the CXL CLI to display the
> full region size rather than just the CXL only region size.
>
> There are 3 sources from which the kernel may be notified that a memory error
> has been detected:
> 1. CXL DRAM event.
>    This is a CXL event that is generated when an error is detected by the
>    CXL device patrol or demand scrubber. The trace_event is augmented to
>    display the aliased System Physical Address (SPA) in addition to the
>    alerted address. However, reporting of memory failure is TBD until the
>    discussion [2] of failure reporting is settled upstream.
> 2. UCNA event from DRAM patrol or demand scrubber. This should eventually go
>    through the MCE callback chain.
> 3. MCE from the kernel consuming poison.
>
> It is possible that all 3 sources may report at the same time and all report
> the same error.
>
> For 2 and 3, an MCE notifier callback is registered by the CXL driver on a
> per-device basis. The callback will determine if the reported address is in
> one of the special regions and offline the aliased address if that is the case.
>
> [1]: https://lore.kernel.org/linux-cxl/668333b17e4b2_5639294fd@xxxxxxxxxxxxxxxxxxxxxxxxx.notmuch/
> [2]: https://lore.kernel.org/linux-cxl/20240808151328.707869-2-ruansy.fnst@xxxxxxxxxxx/
>
> ---
>
> Dave Jiang (4):
>   acpi: numa: Add support to enumerate and store extended linear address mode
>   acpi/hmat / cxl: Add extended linear cache support for CXL
>   cxl: Add extended linear cache address alias emission for cxl events
>   cxl: Add mce notifier to emit aliased address for extended linear cache
>
>  Documentation/ABI/stable/sysfs-devices-node |   6 +++
>  arch/x86/mm/pat/set_memory.c                |   1 +
>  drivers/acpi/numa/hmat.c                    |  44 +++++++++++++++++++
>  drivers/base/node.c                         |   2 +
>  drivers/cxl/Kconfig                         |   4 ++
>  drivers/cxl/core/Makefile                   |   2 +
>  drivers/cxl/core/acpi.c                     |  11 +++++
>  drivers/cxl/core/core.h                     |   3 ++
>  drivers/cxl/core/mbox.c                     |  20 +++++++--
>  drivers/cxl/core/mce.c                      |  65 +++++++++++++++++++++++++++
>  drivers/cxl/core/mce.h                      |  20 +++++++++
>  drivers/cxl/core/region.c                   | 114 +++++++++++++++++++++++++++++++++++++++++++++---
>  drivers/cxl/core/trace.h                    |  31 ++++++++-----
>  drivers/cxl/cxl.h                           |   8 ++++
>  drivers/cxl/cxlmem.h                        |   2 +
>  include/linux/acpi.h                        |  11 +++++
>  include/linux/node.h                        |   7 +++
>  tools/testing/cxl/Kbuild                    |   2 +
>  18 files changed, 332 insertions(+), 21 deletions(-)
>
> base-commit: 0ad2507d5d93f39619fc42372c347d6006b64319
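
For readers following the aliasing described in the cover letter, here is a
minimal standalone sketch of how an address could be mapped to its alias. It
assumes the 1:1 layout from the diagram (cache in the lower half and backing
store in the upper half of one contiguous range), so an address and its alias
differ by exactly the cache size. The names xlc_region, xlc_alias and
xlc_example are hypothetical and are not taken from the patches; only the
all-ones "no alias" value follows the v3/v4 changelog entries.

```c
#include <assert.h>
#include <stdint.h>

/* "No alias" sentinel; the changelog mentions ~0ULL / ULLONG_MAX. */
#define NO_ALIAS UINT64_MAX

/* Hypothetical stand-in for a region; not the kernel's struct cxl_region. */
struct xlc_region {
	uint64_t start;      /* first address of the combined range */
	uint64_t size;       /* total size: cache plus backing store */
	uint64_t cache_size; /* 0 when there is no extended linear cache */
};

/*
 * Return the aliased address for addr, or NO_ALIAS if the region has no
 * cache or addr lies outside the region. Assumes size == 2 * cache_size.
 */
uint64_t xlc_alias(const struct xlc_region *r, uint64_t addr)
{
	if (!r->cache_size)
		return NO_ALIAS;
	if (addr < r->start || addr - r->start >= r->size)
		return NO_ALIAS;
	/* Upper half aliases down by cache_size, lower half aliases up. */
	if (addr - r->start >= r->cache_size)
		return addr - r->cache_size;
	return addr + r->cache_size;
}

/* Usage with the 128GB + 128GB example from the cover letter. */
void xlc_example(void)
{
	const uint64_t gib = 1ULL << 30;
	struct xlc_region r = {
		.start = 0, .size = 256 * gib, .cache_size = 128 * gib,
	};

	assert(xlc_alias(&r, 1 * gib) == 129 * gib);
	assert(xlc_alias(&r, 129 * gib) == 1 * gib);
}
```

Under these assumptions, an error reported at either address of the pair
identifies the other one as well, which is the address the trace events and
the MCE notifier described above would need to emit or offline.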