Hi James, Can you review and help to merge this patch? Thanks, Shiju >-----Original Message----- >From: Rafael J. Wysocki [mailto:rafael@xxxxxxxxxx] >Sent: 05 February 2021 12:54 >To: Shiju Jose <shiju.jose@xxxxxxxxxx>; Borislav Petkov <bp@xxxxxxxxx>; >James Morse <james.morse@xxxxxxx> >Cc: open list:EDAC-CORE <linux-edac@xxxxxxxxxxxxxxx>; ACPI Devel Maling >List <linux-acpi@xxxxxxxxxxxxxxx>; Linux Kernel Mailing List <linux- >kernel@xxxxxxxxxxxxxxx>; Tony Luck <tony.luck@xxxxxxxxx>; Rafael J. Wysocki ><rjw@xxxxxxxxxxxxx>; Len Brown <lenb@xxxxxxxxxx>; Robert Richter ><rrichter@xxxxxxxxxxx>; Mauro Carvalho Chehab ><mchehab+huawei@xxxxxxxxxx>; linuxarm@xxxxxxxxxxxxx; xuwei (O) ><xuwei5@xxxxxxxxxx>; Jonathan Cameron ><jonathan.cameron@xxxxxxxxxx>; John Garry <john.garry@xxxxxxxxxx>; >tanxiaofei <tanxiaofei@xxxxxxxxxx>; Shameerali Kolothum Thodi ><shameerali.kolothum.thodi@xxxxxxxxxx>; Salil Mehta ><salil.mehta@xxxxxxxxxx> >Subject: Re: [PATCH v2 1/2] EDAC/ghes: Add EDAC device for reporting the >CPU cache errors > >On Fri, Jan 29, 2021 at 1:03 PM Shiju Jose <shiju.jose@xxxxxxxxxx> wrote: >> >> CPU L2 cache corrected errors are detected occasionally on few of our >> ARM64 hardware boards. Though it is rare, the probability of the CPU >> cache errors frequently occurring can't be avoided. The earlier >> failure detection by monitoring the cache corrected errors for the >> frequent occurrences and taking preventive action could prevent more >> serious hardware faults. >> >> On Intel architectures, cache corrected errors are reported and the >> affected cores are offlined in the architecture specific method. >> http://www.mcelog.org/cache.html >> >> However for the firmware-first error reporting, specifically on >> ARM64 architectures, there is no provision present for reporting the >> cache corrected error count to the user-space and taking preventive >> action such as offline the affected cores. >> >> For this purpose, it was suggested to create the CPU EDAC device for >> the CPU caches for reporting the cache error count for the >> firmware-first error reporting. >> The EDAC device blocks for the CPU caches would be created based on >> the cache information obtained from the cpu_cacheinfo. >> >> User-space application could monitor the recorded corrected error >> count for the earlier hardware failure detection and could take >> preventive action, such as offline the corresponding CPU core/s. >> >> Add an EDAC device and device blocks for the CPU caches based on the >> cache information from the cpu_cacheinfo. >> The cache's corrected error count would be stored in the >> /sys/devices/system/edac/cpu/cpu*/cache*/ce_count. >> >> Issues and possible solutions, >> 1.Cache info is not available for the CPUs offline. >> EDAC device interface requires creating EDAC device and device >> blocks together. It requires the number of caches per CPU as device >> blocks for the creation. >> However, this info is not available for the offlined CPUs. >> Tested Solution: Find the max number of caches among >> online CPUs, create the EDAC device for CPUs caches >> and get and populate the cache info for an offline >> CPU later, when the error is reported on that CPU for >> the first time. >> >> 2. Reporting error count for the Shared caches. >> There are few possible solutions, >> Tested Solution: >> Kernel would report a new error count for a shared cache >> through the EDAC device block for that CPU on which the error >> is reported. Then user-space application would sum the total >> error count from EDAC device block of all the CPUs in the >> shared CPU list of that shared cache. >> >> For the firmware-first error reporting, add an interface in the >> ghes_edac allow to report a CPU corrected error count. >> >> Suggested-by: James Morse <james.morse@xxxxxxx> >> Signed-off-by: Shiju Jose <shiju.jose@xxxxxxxxxx> > >Boris, James, I need your input here. > >> --- >> Documentation/ABI/testing/sysfs-devices-edac | 15 ++ >> drivers/acpi/apei/ghes.c | 8 +- >> drivers/edac/Kconfig | 12 ++ >> drivers/edac/ghes_edac.c | 186 +++++++++++++++++++ >> include/acpi/ghes.h | 27 +++ >> 5 files changed, 247 insertions(+), 1 deletion(-) >> >> diff --git a/Documentation/ABI/testing/sysfs-devices-edac >> b/Documentation/ABI/testing/sysfs-devices-edac >> index 256a9e990c0b..56a18b0af419 100644 >> --- a/Documentation/ABI/testing/sysfs-devices-edac >> +++ b/Documentation/ABI/testing/sysfs-devices-edac >> @@ -155,3 +155,18 @@ Description: This attribute file displays the >total count of uncorrectable >> errors that have occurred on this DIMM. If panic_on_ue is set, this >> counter will not have a chance to increment, since EDAC will panic >the >> system >> + >> +What: /sys/devices/system/edac/cpu/cpu*/cache*/ce_count >> +Date: December 2020 >> +Contact: linux-edac@xxxxxxxxxxxxxxx >> +Description: This attribute file displays the total count of correctable >> + errors that have occurred on this CPU cache. This count is very >important >> + to examine. CEs provide early indications that a cache is >beginning >> + to fail. This count field should be monitored for non-zero values >> + and report such information to the system administrator. >> + >> +What: /sys/devices/system/edac/cpu/cpu*/cache*/ue_count >> +Date: December 2020 >> +Contact: linux-edac@xxxxxxxxxxxxxxx >> +Description: This attribute file displays the total count of uncorrectable >> + errors that have occurred on this CPU cache. >> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index >> fce7ade2aba9..139540f2c8f4 100644 >> --- a/drivers/acpi/apei/ghes.c >> +++ b/drivers/acpi/apei/ghes.c >> @@ -1452,4 +1452,10 @@ static int __init ghes_init(void) >> err: >> return rc; >> } >> -device_initcall(ghes_init); >> + >> +/* >> + * device_initcall_sync() is added instead of the device_initcall() >> + * because the CPU cacheinfo should be populated and is required for >> + * adding the CPU cache edac device in the ghes_edac_register(). >> + */ >> +device_initcall_sync(ghes_init); >> diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig index >> 81c42664f21b..39fb53aa9cd9 100644 >> --- a/drivers/edac/Kconfig >> +++ b/drivers/edac/Kconfig >> @@ -74,6 +74,18 @@ config EDAC_GHES >> >> In doubt, say 'Y'. >> >> +config EDAC_GHES_CPU_CACHE_ERROR >> + bool "EDAC device for reporting firmware-first BIOS detected CPU >cache error count" >> + depends on EDAC_GHES >> + depends on (ARM64 || COMPILE_TEST) >> + help >> + EDAC device for the firmware-first BIOS detected CPU cache error >count, >> + reported via ACPI APEI/GHES. By enabling this option, EDAC device >for >> + the CPU hierarchy and edac device blocks for the caches would be >created. >> + The cache error count is shared with the userspace via the CPU EDAC >> + device's sysfs interface. This option is architecure independent >though >> + currently it is tested and enabled for ARM64 only. >> + >> config EDAC_AMD64 >> tristate "AMD64 (Opteron, Athlon64)" >> depends on AMD_NB && EDAC_DECODE_MCE diff --git >> a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c index >> 6d1ddecbf0da..400b50be0c33 100644 >> --- a/drivers/edac/ghes_edac.c >> +++ b/drivers/edac/ghes_edac.c >> @@ -12,6 +12,9 @@ >> #include <acpi/ghes.h> >> #include <linux/edac.h> >> #include <linux/dmi.h> >> +#if defined(CONFIG_EDAC_GHES_CPU_CACHE_ERROR) >> +#include <linux/cacheinfo.h> >> +#endif >> #include "edac_module.h" >> #include <ras/ras_event.h> >> >> @@ -57,6 +60,21 @@ module_param(force_load, bool, 0); >> >> static bool system_scanned; >> >> +#if defined(CONFIG_EDAC_GHES_CPU_CACHE_ERROR) >> +struct ghes_edac_cpu_block { >> + int cpu; >> + u8 level; >> + u8 type; >> + int block_nr; >> + bool info_populated; >> +}; >> + >> +static struct ghes_edac_cpu_block __percpu *edac_cpu_block_list; >> + >> +static struct edac_device_ctl_info *cpu_edac_dev; static unsigned int >> +num_caches_per_cpu; #endif >> + >> /* Memory Device - Type 17 of SMBIOS spec */ struct >memdev_dmi_entry >> { >> u8 type; >> @@ -225,6 +243,164 @@ static void enumerate_dimms(const struct >dmi_header *dh, void *arg) >> hw->num_dimms++; >> } >> >> +#if defined(CONFIG_EDAC_GHES_CPU_CACHE_ERROR) >> +static void ghes_edac_get_cpu_cacheinfo(void) { >> + int cpu; >> + struct cpu_cacheinfo *this_cpu_ci; >> + >> + /* >> + * Find the maximum number of caches present in the CPU heirarchy >> + * among the online CPUs. >> + */ >> + for_each_online_cpu(cpu) { >> + this_cpu_ci = get_cpu_cacheinfo(cpu); >> + if (!this_cpu_ci) >> + continue; >> + if (num_caches_per_cpu < this_cpu_ci->num_leaves) >> + num_caches_per_cpu = this_cpu_ci->num_leaves; >> + } >> +} >> + >> +static int ghes_edac_add_cpu_device(struct device *dev) { >> + int rc; >> + >> + cpu_edac_dev = edac_device_alloc_ctl_info(0, "cpu", >num_possible_cpus(), >> + "cache", num_caches_per_cpu, 0, NULL, >> + 0, edac_device_alloc_index()); >> + if (!cpu_edac_dev) { >> + pr_warn("ghes-edac cpu edac device registration failed\n"); >> + return -ENOMEM; >> + } >> + >> + cpu_edac_dev->dev = dev; >> + cpu_edac_dev->ctl_name = "cpu_edac_dev"; >> + cpu_edac_dev->dev_name = "ghes"; >> + cpu_edac_dev->mod_name = "ghes_edac.c"; >> + rc = edac_device_add_device(cpu_edac_dev); >> + if (rc > 0) { >> + pr_warn("edac_device_add_device failed\n"); >> + edac_device_free_ctl_info(cpu_edac_dev); >> + return -ENOMEM; >> + } >> + >> + return 0; >> +} >> + >> +static void ghes_edac_delete_cpu_device(void) { >> + num_caches_per_cpu = 0; >> + if (cpu_edac_dev) { >> + edac_device_del_device(cpu_edac_dev->dev); >> + edac_device_free_ctl_info(cpu_edac_dev); >> + } >> + free_percpu(edac_cpu_block_list); } >> + >> +static int ghes_edac_populate_cache_info(int cpu) { >> + struct ghes_edac_cpu_block *block; >> + struct cpu_cacheinfo *this_cpu_ci; >> + struct cacheinfo *this_leaf; >> + int i, num_caches; >> + >> + this_cpu_ci = get_cpu_cacheinfo(cpu); >> + if (!this_cpu_ci || !this_cpu_ci->info_list || !this_cpu_ci- >>num_leaves) >> + return -EINVAL; >> + >> + this_leaf = this_cpu_ci->info_list; >> + /* >> + * Cache info would not be available for a CPU which is offline. >However EDAC device >> + * creation requires the number of device blocks (for example max >number of caches >> + * among CPUs). The cache info in the edac_cpu_block_list would be >populated when >> + * the first error is reported on that cpu. Thus we need to restrict the >number >> + * of caches if the CPU's num_leaves exceed the max number of caches >per cpu >> + * calculated in the init time. >> + */ >> + num_caches = min(num_caches_per_cpu, this_cpu_ci->num_leaves); >> + >> + /* >> + * The edac CPU cache device blocks entries in the sysfs should match >with the >> + * CPU cache structure in the sysfs so that the affected cpus for a >shared cache >> + * can be easily extracted in the userspace. >> + */ >> + block = per_cpu_ptr(edac_cpu_block_list, cpu); >> + for (i = 0; i < num_caches; i++) { >> + block->cpu = cpu; >> + block->level = this_leaf->level; >> + block->type = this_leaf->type; >> + block->block_nr = i; >> + block->info_populated = true; >> + this_leaf++; >> + block++; >> + } >> + >> + return 0; >> +} >> + >> +static void ghes_edac_create_cpu_device(struct device *dev) { >> + int cpu; >> + >> + if (!num_caches_per_cpu) >> + return; >> + >> + /* >> + * Allocate EDAC CPU cache list. >> + * EDAC device interface require creating the CPU cache hierarchy for >all >> + * the CPUs together. Thus need to allocate edac_cpu_block_list for >the >> + * maximum number of caches per cpu among all the CPUs >irrespective of >> + * the number of caches per CPU might vary. >> + */ >> + edac_cpu_block_list = __alloc_percpu(sizeof(struct >ghes_edac_cpu_block) * num_caches_per_cpu, >> + __alignof__(u64)); >> + if (!edac_cpu_block_list) >> + return; >> + >> + if (ghes_edac_add_cpu_device(dev)) { >> + ghes_edac_delete_cpu_device(); >> + return; >> + } >> + >> + /* >> + * Populate EDAC CPU cache list with cache's information. >> + */ >> + for_each_online_cpu(cpu) >> + ghes_edac_populate_cache_info(cpu); >> +} >> + >> +void ghes_edac_report_cpu_error(struct ghes_einfo_cpu *einfo) { >> + struct ghes_edac_cpu_block *block; >> + int i; >> + >> + if (!einfo || !(einfo->ce_count) || !num_caches_per_cpu) >> + return; >> + >> + /* >> + * EDAC device require the number of device blocks (for example max >number of caches >> + * among CPUs) during the creation. For the CPUs that were offline in >the cpu edac >> + * init and become online later, the cache info would be populated >when the first >> + * error is reported on that cpu. >> + */ >> + block = per_cpu_ptr(edac_cpu_block_list, einfo->cpu); >> + if (!block->info_populated) { >> + if (ghes_edac_populate_cache_info(einfo->cpu)) >> + return; >> + } >> + >> + for (i = 0; i < num_caches_per_cpu; i++) { >> + if ((block->level == einfo->cache_level) && (block->type == einfo- >>cache_type)) { >> + edac_device_handle_ce_count(cpu_edac_dev, einfo- >>ce_count, >> + einfo->cpu, block->block_nr, ""); >> + break; >> + } >> + block++; >> + } >> +} >> +#endif >> + >> static void ghes_scan_system(void) >> { >> if (system_scanned) >> @@ -232,6 +408,8 @@ static void ghes_scan_system(void) >> >> dmi_walk(enumerate_dimms, &ghes_hw); >> >> + ghes_edac_get_cpu_cacheinfo(); >> + >> system_scanned = true; >> } >> >> @@ -620,6 +798,10 @@ int ghes_edac_register(struct ghes *ghes, struct >device *dev) >> goto unlock; >> } >> >> +#if defined(CONFIG_EDAC_GHES_CPU_CACHE_ERROR) >> + ghes_edac_create_cpu_device(dev); #endif >> + >> spin_lock_irqsave(&ghes_lock, flags); >> ghes_pvt = pvt; >> spin_unlock_irqrestore(&ghes_lock, flags); @@ -669,6 +851,10 >> @@ void ghes_edac_unregister(struct ghes *ghes) >> if (mci) >> edac_mc_free(mci); >> >> +#if defined(CONFIG_EDAC_GHES_CPU_CACHE_ERROR) >> + ghes_edac_delete_cpu_device(); #endif >> + >> unlock: >> mutex_unlock(&ghes_reg_mutex); } diff --git >> a/include/acpi/ghes.h b/include/acpi/ghes.h index >> 34fb3431a8f3..e019ad88fdc3 100644 >> --- a/include/acpi/ghes.h >> +++ b/include/acpi/ghes.h >> @@ -73,6 +73,24 @@ void ghes_unregister_vendor_record_notifier(struct >> notifier_block *nb); >> >> int ghes_estatus_pool_init(int num_ghes); >> >> +/** >> + * struct ghes_einfo_cpu - structure to pass the CPU error info to >> +the edac >> + * @cpu: CPU index. >> + * @error_type: error type, cache/TLB/bus/ etc. >> + * @cache_level: cache level. >> + * @cache_type: ACPI cache type. >> + * @ue_count: CPU uncorrectable error count. >> + * @ce_count: CPU correctable error count. >> + */ >> +struct ghes_einfo_cpu { >> + int cpu; >> + u8 error_type; >> + u8 cache_level; >> + u8 cache_type; >> + u16 ue_count; >> + u16 ce_count; >> +}; >> + >> /* From drivers/edac/ghes_edac.c */ >> >> #ifdef CONFIG_EDAC_GHES >> @@ -98,6 +116,15 @@ static inline void ghes_edac_unregister(struct >> ghes *ghes) } #endif >> >> +#ifdef CONFIG_EDAC_GHES_CPU_CACHE_ERROR void >> +ghes_edac_report_cpu_error(struct ghes_einfo_cpu *einfo_cpu); >> + >> +#else >> +static inline void ghes_edac_report_cpu_error(struct ghes_einfo_cpu >> +*einfo_cpu) { } #endif >> + >> static inline int acpi_hest_get_version(struct acpi_hest_generic_data >> *gdata) { >> return gdata->revision >> 8; >> -- >> 2.17.1 >>