On 2/28/24 12:00 AM, Dan Williams wrote: > Ben Cheatham wrote: >> This patch had an outdated commit message, so here's the patch with an updated description. >> >> I also realized that I was wrong about letting CXL 2.0+ error types (discussed a revision >> or two ago) and I wasn't actually letting them through. I've went ahead and added >> the ability to inject CXL 2.0+ error through the legacy interface. This pretty >> much amounts to returning an error for CXL 1.0/1.1 injection types in einj_error_inject() >> and instead routing them through a new einj_cxl_rch_error_inject() function called >> in einj-cxl.c >> >> If this change is too big I can send in another revision, I just wanted to avoid >> spamming the list(s). >> >> From eea1cf991dc2a551f6db2e3bb9510ed43c86762d Mon Sep 17 00:00:00 2001 >> From: Ben Cheatham <Benjamin.Cheatham@xxxxxxx> >> Date: Fri, 16 Feb 2024 11:12:51 -0600 >> Subject: [PATCH v14 2/4] EINJ: Add CXL error type support >> >> Move CXL protocol error types from einj.c (now einj-core.c) to einj-cxl.c. >> einj-cxl.c implements the necessary handling for CXL protocol error >> injection and exposes an API for the CXL core to use said functionality, >> while also allowing the EINJ module to be built without CXL support. >> Because CXL error types targeting CXL 1.0/1.1 ports require special >> handling, only allow them to be injected through the new cxl debugfs >> interface (next commit) and return an error when attempting to inject >> through the legacy interface. >> >> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@xxxxxxx> >> --- >> MAINTAINERS | 1 + >> drivers/acpi/apei/Kconfig | 12 +++ >> drivers/acpi/apei/Makefile | 2 + >> drivers/acpi/apei/apei-internal.h | 18 ++++ >> drivers/acpi/apei/{einj.c => einj-core.c} | 85 +++++++++++---- >> drivers/acpi/apei/einj-cxl.c | 121 ++++++++++++++++++++++ >> include/linux/einj-cxl.h | 40 +++++++ >> 7 files changed, 257 insertions(+), 22 deletions(-) >> rename drivers/acpi/apei/{einj.c => einj-core.c} (94%) >> create mode 100644 drivers/acpi/apei/einj-cxl.c >> create mode 100644 include/linux/einj-cxl.h >> >> diff --git a/MAINTAINERS b/MAINTAINERS >> index 2ecaaec6a6bf..90cf8403dd17 100644 >> --- a/MAINTAINERS >> +++ b/MAINTAINERS >> @@ -5289,6 +5289,7 @@ M: Dan Williams <dan.j.williams@xxxxxxxxx> >> L: linux-cxl@xxxxxxxxxxxxxxx >> S: Maintained >> F: drivers/cxl/ >> +F: include/linux/cxl-einj.h >> F: include/linux/cxl-event.h >> F: include/uapi/linux/cxl_mem.h >> F: tools/testing/cxl/ >> diff --git a/drivers/acpi/apei/Kconfig b/drivers/acpi/apei/Kconfig >> index 6b18f8bc7be3..f01afa2805be 100644 >> --- a/drivers/acpi/apei/Kconfig >> +++ b/drivers/acpi/apei/Kconfig >> @@ -60,6 +60,18 @@ config ACPI_APEI_EINJ >> mainly used for debugging and testing the other parts of >> APEI and some other RAS features. >> >> +config ACPI_APEI_EINJ_CXL >> + bool "CXL Error INJection Support" >> + default ACPI_APEI_EINJ >> + depends on ACPI_APEI_EINJ && CXL_BUS <= ACPI_APEI_EINJ >> + help >> + Support for CXL protocol Error INJection through debugfs/cxl. >> + Availability and which errors are supported is dependent on >> + the host platform. Look to ACPI v6.5 section 18.6.4 and kernel >> + EINJ documentation for more information. >> + >> + If unsure say 'n' >> + >> config ACPI_APEI_ERST_DEBUG >> tristate "APEI Error Record Serialization Table (ERST) Debug Support" >> depends on ACPI_APEI >> diff --git a/drivers/acpi/apei/Makefile b/drivers/acpi/apei/Makefile >> index 4dfac2128737..2c474e6477e1 100644 >> --- a/drivers/acpi/apei/Makefile >> +++ b/drivers/acpi/apei/Makefile >> @@ -2,6 +2,8 @@ >> obj-$(CONFIG_ACPI_APEI) += apei.o >> obj-$(CONFIG_ACPI_APEI_GHES) += ghes.o >> obj-$(CONFIG_ACPI_APEI_EINJ) += einj.o >> +einj-y := einj-core.o >> +einj-$(CONFIG_ACPI_APEI_EINJ_CXL) += einj-cxl.o >> obj-$(CONFIG_ACPI_APEI_ERST_DEBUG) += erst-dbg.o >> >> apei-y := apei-base.o hest.o erst.o bert.o >> diff --git a/drivers/acpi/apei/apei-internal.h b/drivers/acpi/apei/apei-internal.h >> index 67c2c3b959e1..cd2766c69d78 100644 >> --- a/drivers/acpi/apei/apei-internal.h >> +++ b/drivers/acpi/apei/apei-internal.h >> @@ -130,4 +130,22 @@ static inline u32 cper_estatus_len(struct acpi_hest_generic_status *estatus) >> } >> >> int apei_osc_setup(void); >> + >> +int einj_get_available_error_type(u32 *type); >> +int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2, u64 param3, >> + u64 param4); >> +int einj_cxl_rch_error_inject(u32 type, u32 flags, u64 param1, u64 param2, >> + u64 param3, u64 param4); >> +bool einj_is_cxl_error_type(u64 type); >> +int einj_validate_error_type(u64 type); >> + >> +#ifndef ACPI_EINJ_CXL_CACHE_CORRECTABLE >> +#define ACPI_EINJ_CXL_CACHE_CORRECTABLE BIT(12) >> +#define ACPI_EINJ_CXL_CACHE_UNCORRECTABLE BIT(13) >> +#define ACPI_EINJ_CXL_CACHE_FATAL BIT(14) >> +#define ACPI_EINJ_CXL_MEM_CORRECTABLE BIT(15) >> +#define ACPI_EINJ_CXL_MEM_UNCORRECTABLE BIT(16) >> +#define ACPI_EINJ_CXL_MEM_FATAL BIT(17) >> +#endif >> + >> #endif >> diff --git a/drivers/acpi/apei/einj.c b/drivers/acpi/apei/einj-core.c >> similarity index 94% >> rename from drivers/acpi/apei/einj.c >> rename to drivers/acpi/apei/einj-core.c >> index 937c69844dac..437c13949be7 100644 >> --- a/drivers/acpi/apei/einj.c >> +++ b/drivers/acpi/apei/einj-core.c >> @@ -37,6 +37,12 @@ >> #define MEM_ERROR_MASK (ACPI_EINJ_MEMORY_CORRECTABLE | \ >> ACPI_EINJ_MEMORY_UNCORRECTABLE | \ >> ACPI_EINJ_MEMORY_FATAL) >> +#define CXL_ERROR_MASK (ACPI_EINJ_CXL_CACHE_CORRECTABLE | \ >> + ACPI_EINJ_CXL_CACHE_UNCORRECTABLE | \ >> + ACPI_EINJ_CXL_CACHE_FATAL | \ >> + ACPI_EINJ_CXL_MEM_CORRECTABLE | \ >> + ACPI_EINJ_CXL_MEM_UNCORRECTABLE | \ >> + ACPI_EINJ_CXL_MEM_FATAL) >> >> /* >> * ACPI version 5 provides a SET_ERROR_TYPE_WITH_ADDRESS action. >> @@ -141,7 +147,7 @@ static DEFINE_MUTEX(einj_mutex); >> /* >> * Exported APIs use this flag to exit early if einj_probe() failed. >> */ >> -static bool einj_initialized __ro_after_init; >> +bool einj_initialized __ro_after_init; >> >> static void *einj_param; >> >> @@ -166,7 +172,7 @@ static int __einj_get_available_error_type(u32 *type) >> } >> >> /* Get error injection capabilities of the platform */ >> -static int einj_get_available_error_type(u32 *type) >> +int einj_get_available_error_type(u32 *type) >> { >> int rc; >> >> @@ -536,8 +542,8 @@ static int __einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2, >> } >> >> /* Inject the specified hardware error */ >> -static int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2, >> - u64 param3, u64 param4) >> +int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2, u64 param3, >> + u64 param4) >> { >> int rc; >> u64 base_addr, size; >> @@ -560,8 +566,18 @@ static int einj_error_inject(u32 type, u32 flags, u64 param1, u64 param2, >> if (type & ACPI5_VENDOR_BIT) { >> if (vendor_flags != SETWA_FLAGS_MEM) >> goto inject; >> - } else if (!(type & MEM_ERROR_MASK) && !(flags & SETWA_FLAGS_MEM)) >> + } else if (!(type & MEM_ERROR_MASK) && !(flags & SETWA_FLAGS_MEM)) { >> goto inject; >> + } >> + >> + /* >> + * Injections targeting a CXL 1.0/1.1 port have to be injected >> + * from the CXL debugfs interface so that we can guarantee a >> + * correct MMIO address. >> + */ > > Given that the CXL debugfs is not present in this file I would update > this comment to give a hints using local references. Something like: > > /* > * Injections targeting a CXL 1.0/1.1 port have to be injected > * via the einj_cxl_rch_error_inject() path as that does proper > * input validation that passed address is an RCRB base address. > */ > I agree, I'll change it to something along those lines. > ...that said, how does this work for the CXL 2.0 path? Does not > einj_cxl_inject_error() need to use __einj_error_inject() rather than > einj_error_inject() to get by this check? > The way that the target port type (1.0 vs. 2.0+) is differentiated is whether we are targeting a memory address, which is what SETWA_FLAGS_MEM indicates. So for the CXL 2.0+ path, the check will pass. Although the error type is a CXL error type, the flags variable will be set to SETWA_FLAGS_SBDF (I'm 90% sure that's what it's called, but it's something along those lines). Let me know if that doesn't answer your question!