On Fri, 6 Sep 2019 at 09:33, Xiang Zheng <zhengxiang9@xxxxxxxxxx> wrote: > > From: Dongjiu Geng <gengdongjiu@xxxxxxxxxx> > > Add a SIGBUS signal handler. In this handler, it checks the SIGBUS type, > translates the host VA delivered by host to guest PA, then fills this PA > to guest APEI GHES memory, then notifies guest according to the SIGBUS > type. > > If guest accesses the poisoned memory, it generates Synchronous External > Abort(SEA). Then host kernel gets an APEI notification and calls > memory_failure() to unmapped the affected page in stage 2, finally > returns to guest. > > Guest continues to access PG_hwpoison page, it will trap to KVM as > stage2 fault, then a SIGBUS_MCEERR_AR synchronous signal is delivered to > Qemu, Qemu records this error address into guest APEI GHES memory and > notifes guest using Synchronous-External-Abort(SEA). > > Suggested-by: James Morse <james.morse@xxxxxxx> > Signed-off-by: Dongjiu Geng <gengdongjiu@xxxxxxxxxx> > Signed-off-by: Xiang Zheng <zhengxiang9@xxxxxxxxxx> > --- > hw/acpi/acpi_ghes.c | 252 ++++++++++++++++++++++++++++++++++++ > include/hw/acpi/acpi_ghes.h | 40 ++++++ > include/sysemu/kvm.h | 2 +- > target/arm/kvm64.c | 39 ++++++ > 4 files changed, 332 insertions(+), 1 deletion(-) I'll let somebody else review the ACPI parts as that's not my area of expertise, but I'll look at the target/arm parts below: > diff --git a/hw/acpi/acpi_ghes.c b/hw/acpi/acpi_ghes.c > index 20c45179ff..2d17c88045 100644 > --- a/hw/acpi/acpi_ghes.c > +++ b/hw/acpi/acpi_ghes.c > @@ -26,6 +26,168 @@ > #include "sysemu/sysemu.h" > #include "qemu/error-report.h" > > +/* Total size for Generic Error Status Block This block comment should start with '/*' on a line of its own (as should others in this patch). Usually checkpatch catches these but it's not infallible. > diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h > index 909bcd77cf..5f57e4ed43 100644 > --- a/include/sysemu/kvm.h > +++ b/include/sysemu/kvm.h > @@ -378,7 +378,7 @@ bool kvm_vcpu_id_is_valid(int vcpu_id); > /* Returns VCPU ID to be used on KVM_CREATE_VCPU ioctl() */ > unsigned long kvm_arch_vcpu_id(CPUState *cpu); > > -#ifdef TARGET_I386 > +#if defined(TARGET_I386) || defined(TARGET_AARCH64) > #define KVM_HAVE_MCE_INJECTION 1 > void kvm_arch_on_sigbus_vcpu(CPUState *cpu, int code, void *addr); > #endif Rather than introducing a new ifdef with lots of TARGET_*, I think it would be better to have target/i386/cpu.h and target/arm/cpu.h do "#define KVM_HAVE_MCE_INJECTION 1" (nb that the arm cpu.h needs to define it only for aarch64, not for 32-bit arm host compiles). and then kvm.h can just do #ifdef KVM_HAVE_MCE_INJECTION void kvm_arch_on_sigbus_vcpu(CPUState *cpu, int code, void *addr); #endif > diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c > index bf6edaa3f6..186d855522 100644 > --- a/target/arm/kvm64.c > +++ b/target/arm/kvm64.c > @@ -28,6 +28,8 @@ > #include "kvm_arm.h" > #include "hw/boards.h" > #include "internals.h" > +#include "hw/acpi/acpi.h" > +#include "hw/acpi/acpi_ghes.h" > > static bool have_guest_debug; > > @@ -1070,6 +1072,43 @@ int kvm_arch_get_registers(CPUState *cs) > return ret; > } > > +void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr) > +{ > + ram_addr_t ram_addr; > + hwaddr paddr; > + > + assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO); > + > + if (acpi_enabled && addr && > + object_property_get_bool(qdev_get_machine(), "ras", NULL)) { > + ram_addr = qemu_ram_addr_from_host(addr); > + if (ram_addr != RAM_ADDR_INVALID && > + kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) { > + kvm_hwpoison_page_add(ram_addr); > + /* Asynchronous signal will be masked by main thread, so > + * only handle synchronous signal. > + */ I don't entirely understand this comment. The x86 version of this function says: /* If we get an action required MCE, it has been injected by KVM * while the VM was running. An action optional MCE instead should * be coming from the main thread, which qemu_init_sigbus identifies * as the "early kill" thread. */ so we can be called for action-optional MCE here (not on the vcpu thread). We obviously can't deliver those as a synchronous exception to a particular CPU, but is there no mechanism for reporting them to the guest at all? > + if (code == BUS_MCEERR_AR) { > + kvm_cpu_synchronize_state(c); > + if (ACPI_GHES_CPER_FAIL != > + acpi_ghes_record_errors(ACPI_GHES_NOTIFY_SEA, paddr)) { > + kvm_inject_arm_sea(c); > + } else { > + fprintf(stderr, "failed to record the error\n"); > + } > + } > + return; > + } > + fprintf(stderr, "Hardware memory error for memory used by " > + "QEMU itself instead of guest system!\n"); > + } > + > + if (code == BUS_MCEERR_AR) { > + fprintf(stderr, "Hardware memory error!\n"); > + exit(1); > + } > +} > + > /* C6.6.29 BRK instruction */ > static const uint32_t brk_insn = 0xd4200000; > thanks -- PMM