On Mon, Feb 27, 2023 at 01:03:14PM +0800, Shuai Xue wrote:
> There are two major types of uncorrected recoverable (UCR) errors:
>
> - Action Required (AR): The error is detected and the processor has
>   already consumed the memory. The OS is required to take action (for
>   example, offline the failing page or kill the failing thread) to
>   recover from this uncorrectable error.
>
> - Action Optional (AO): The error is detected outside of processor
>   execution context. Some data in memory is corrupted, but the data has
>   not been consumed yet. The OS may optionally take action to recover
>   from this uncorrectable error.
>
> The essential difference between AR and AO errors is that AR is a
> synchronous event, while AO is an asynchronous event. The hardware will
> signal a synchronous exception (Machine Check Exception on x86 and
> Synchronous External Abort on arm64) when an error is detected and the
> memory access has been architecturally executed.
>
> When APEI firmware-first handling is enabled, a platform may describe one
> error source for the handling of synchronous errors (e.g. MCE or SEA
> notification), or for handling asynchronous errors (e.g. SCI or External
> Interrupt notification). In other words, we can distinguish synchronous
> errors by the APEI notification type. For AR errors, the kernel kills the
> current process accessing the poisoned page by sending SIGBUS with
> BUS_MCEERR_AR. For AO errors, the kernel notifies the process that owns
> the poisoned page by sending SIGBUS with BUS_MCEERR_AO in early kill
> mode. However, the GHES driver always sets mf_flags to 0, so all UCR
> errors are handled as AO errors in memory_failure().
>
> To this end, set the memory failure flags to MF_ACTION_REQUIRED on
> synchronous events.
>
> Fixes: ba61ca4aab47 ("ACPI, APEI, GHES: Add hardware memory error recovery support")
> Signed-off-by: Shuai Xue <xueshuai@xxxxxxxxxxxxxxxxx>
> ---
>  drivers/acpi/apei/ghes.c | 28 ++++++++++++++++++++++------
>  1 file changed, 22 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 34ad071a64e9..5d37fb4bca67 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -101,6 +101,19 @@ static inline bool is_hest_type_generic_v2(struct ghes *ghes)
>  	return ghes->generic->header.type == ACPI_HEST_TYPE_GENERIC_ERROR_V2;
>  }
>
> +/*
> + * A platform may describe one error source for the handling of synchronous
> + * errors (e.g. MCE or SEA), or for handling asynchronous errors (e.g. SCI
> + * or External Interrupt).
> + */
> +static inline bool is_hest_sync_notify(struct ghes *ghes)
> +{
> +	int notify_type = ghes->generic->notify.type;
> +
> +	return notify_type == ACPI_HEST_NOTIFY_SEA ||
> +	       notify_type == ACPI_HEST_NOTIFY_MCE;
> +}

This code seems to read that all MCEs are synchronous, which I think is
not correct. The scenario I'm worrying about is that is_hest_sync_notify()
returns true when this code is called for an AO MCE (so an asynchronous
one). Then ghes_do_memory_failure() (updated by your patch 2/2) will choose
to use task_work instead of memory_failure_queue(). This should not be
expected. Or does that never happen?
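Just to make the concern concrete, here is a minimal, untested sketch of
the kind of narrowing I have in mind. It assumes that an MCE notification
source can also report AO (deferred) errors and that x86 does not rely on
this GHES path for the synchronous case, which is exactly the open question
above, so please take it as an illustration rather than a proposal:

static inline bool is_hest_sync_notify(struct ghes *ghes)
{
	/*
	 * SEA is reported in the context of the access that consumed the
	 * poison, so it can safely be treated as synchronous. An MCE
	 * notification may also carry AO (asynchronous) errors, so it is
	 * not assumed to be synchronous here.
	 */
	return ghes->generic->notify.type == ACPI_HEST_NOTIFY_SEA;
}

With something like that, an AO MCE would keep going through
memory_failure_queue() as it does today.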
- Naoya Horiguchi

> +
>  /*
>   * This driver isn't really modular, however for the time being,
>   * continuing to use module_param is the easiest way to remain
> @@ -477,7 +490,7 @@ static bool ghes_do_memory_failure(u64 physical_addr, int flags)
>  }
>
>  static bool ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata,
> -				       int sev)
> +				       int sev, bool sync)
>  {
>  	int flags = -1;
>  	int sec_sev = ghes_severity(gdata->error_severity);
> @@ -491,7 +504,7 @@ static bool ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata,
>  	    (gdata->flags & CPER_SEC_ERROR_THRESHOLD_EXCEEDED))
>  		flags = MF_SOFT_OFFLINE;
>  	if (sev == GHES_SEV_RECOVERABLE && sec_sev == GHES_SEV_RECOVERABLE)
> -		flags = 0;
> +		flags = sync ? MF_ACTION_REQUIRED : 0;
>
>  	if (flags != -1)
>  		return ghes_do_memory_failure(mem_err->physical_addr, flags);
> @@ -499,12 +512,14 @@ static bool ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata,
>  	return false;
>  }
>
> -static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata, int sev)
> +static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata,
> +				     int sev, bool sync)
>  {
>  	struct cper_sec_proc_arm *err = acpi_hest_get_payload(gdata);
>  	bool queued = false;
>  	int sec_sev, i;
>  	char *p;
> +	int flags = sync ? MF_ACTION_REQUIRED : 0;
>
>  	log_arm_hw_error(err);
>
> @@ -526,7 +541,7 @@ static bool ghes_handle_arm_hw_error(struct acpi_hest_generic_data *gdata, int s
>  		 * and don't filter out 'corrected' error here.
>  		 */
>  		if (is_cache && has_pa) {
> -			queued = ghes_do_memory_failure(err_info->physical_fault_addr, 0);
> +			queued = ghes_do_memory_failure(err_info->physical_fault_addr, flags);
>  			p += err_info->length;
>  			continue;
>  		}
> @@ -647,6 +662,7 @@ static bool ghes_do_proc(struct ghes *ghes,
>  	const guid_t *fru_id = &guid_null;
>  	char *fru_text = "";
>  	bool queued = false;
> +	bool sync = is_hest_sync_notify(ghes);
>
>  	sev = ghes_severity(estatus->error_severity);
>  	apei_estatus_for_each_section(estatus, gdata) {
> @@ -664,13 +680,13 @@ static bool ghes_do_proc(struct ghes *ghes,
>  			atomic_notifier_call_chain(&ghes_report_chain, sev, mem_err);
>
>  			arch_apei_report_mem_error(sev, mem_err);
> -			queued = ghes_handle_memory_failure(gdata, sev);
> +			queued = ghes_handle_memory_failure(gdata, sev, sync);
>  		}
>  		else if (guid_equal(sec_type, &CPER_SEC_PCIE)) {
>  			ghes_handle_aer(gdata);
>  		}
>  		else if (guid_equal(sec_type, &CPER_SEC_PROC_ARM)) {
> -			queued = ghes_handle_arm_hw_error(gdata, sev);
> +			queued = ghes_handle_arm_hw_error(gdata, sev, sync);
>  		} else {
>  			void *err = acpi_hest_get_payload(gdata);
>
> --
> 2.20.1.12.g72788fdb