Hi James,

Thanks for your comments.

On 2017/3/6 18:00, James Morse wrote:
> Hi Xie XiuQi,
>
> On 03/03/17 10:39, Xie XiuQi wrote:
>> ARM APEI extension proposal added SEI (asynchronous SError interrupt)
>> notification type for ARMv8.
>>
>> Add a new GHES error source handling function for SEI. In firmware
>> first mode, if an error source's notification type is SEI. Then GHES
>> could parse and report the detail error information.
>
> This patch doesn't apply to any upstream tree. Is this based on Tyler's larger
> UEFI/ACPI update series? If so, please mention this in your cover letter, (Nit:
> please include a cover letter when sending two or more patches!).
>

Yes, this patch is based on Tyler's series "[PATCH V11 00/10] Add UEFI 2.6
and ACPI 6.1 updates for RAS on ARM64" and linux-next 20170302.

I'll add a cover letter next time, thanks.

> What happens if the SError Interrupt arrives while KVM was doing its work? We
> set the HCR_EL2.AMO bit when running a guest, so KVM may receive these instead
> of the host kernel.
>

OK, I'll do it in the next version.

>
>> diff --git a/drivers/acpi/apei/Kconfig b/drivers/acpi/apei/Kconfig
>> index 1122d7f..a32f046 100644
>> --- a/drivers/acpi/apei/Kconfig
>> +++ b/drivers/acpi/apei/Kconfig
>> @@ -18,6 +18,20 @@ config HAVE_ACPI_APEI_SEA
>>      option allows the OS to look for such hardware error record, and
>>      take appropriate action.
>>
>> +config ACPI_APEI_SEI
>> +    bool "APEI Asynchronous SError Interrupt logging/recovering support"
>> +    depends on ARM64 && ACPI_APEI_GHES
>> +    help
>> +      This option should be enabled if the system supports
>> +      firmware first handling of SEI (asynchronous SError interrupt).
>> +
>> +      SEI happens with invalid instruction access or asynchronous exceptions
>> +      on ARMv8 systems. If a system supports firmware first handling of SEI,
>> +      the platform analyzes and handles hardware error notifications from
>> +      SEI, and it may then form a HW error record for the OS to parse and
>> +      handle. This option allows the OS to look for such hardware error
>> +      record, and take appropriate action.
>> +
>>  config ACPI_APEI
>>      bool "ACPI Platform Error Interface (APEI)"
>>      select MISC_FILESYSTEMS
>> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
>> index 3e4ea1b..d084a09 100644
>> --- a/drivers/acpi/apei/ghes.c
>> +++ b/drivers/acpi/apei/ghes.c
>> @@ -850,6 +850,50 @@ static inline void ghes_sea_remove(struct ghes *ghes)
>>  }
>>  #endif /* CONFIG_HAVE_ACPI_APEI_SEA */
>>
>> +#ifdef CONFIG_ACPI_APEI_SEI
>> +static LIST_HEAD(ghes_sei);
>> +
>> +void ghes_notify_sei(void)
>> +{
>> +    struct ghes *ghes;
>> +
>> +    /*
>> +     * synchronize_rcu() will wait for nmi_exit(), so no need to
>
> Where nmi_exit()?
>
> This nmi enter/exit was to prevent APEI being interrupted by APEI and trying to
> take the same set of locks. APEI masks IRQs to prevent this happening normally,
> but Synchronous External Abort couldn't be masked.
> We don't mask Asynchronous Exceptions in APEI so the same thing can happen here.
> Adding nmi_{enter,exit}() round the ghes call in the arch bad_mode() will
> prevent this lockup.
>

Thank you for your detailed explanation. I'll add it in the next version.

Thanks,
Xie XiuQi