On Fri, 17 Jan 2020 at 10:05, gengdongjiu <gengdongjiu@xxxxxxxxxx> wrote: > > On 2020/1/17 0:28, Peter Maydell wrote: > > On Wed, 8 Jan 2020 at 11:33, Dongjiu Geng <gengdongjiu@xxxxxxxxxx> wrote: > > > >> +void kvm_arch_on_sigbus_vcpu(CPUState *c, int code, void *addr) > >> +{ > >> + ram_addr_t ram_addr; > >> + hwaddr paddr; > >> + > >> + assert(code == BUS_MCEERR_AR || code == BUS_MCEERR_AO); > >> + > >> + if (acpi_enabled && addr && > >> + object_property_get_bool(qdev_get_machine(), "ras", NULL)) { > >> + ram_addr = qemu_ram_addr_from_host(addr); > >> + if (ram_addr != RAM_ADDR_INVALID && > >> + kvm_physical_memory_addr_from_host(c->kvm_state, addr, &paddr)) { > >> + kvm_hwpoison_page_add(ram_addr); > >> + /* > >> + * Asynchronous signal will be masked by main thread, so > >> + * only handle synchronous signal. > >> + */ > > > > I don't understand this comment. (I think we've had discussions > > about it before, but it's still not clear to me.) > > > > This function (kvm_arch_on_sigbus_vcpu()) will be called in two contexts: > > > > (1) in the vcpu thread: > > * the real SIGBUS handler sigbus_handler() sets a flag and arranges > > for an immediate vcpu exit > > * the vcpu thread reads the flag on exit from KVM_RUN and > > calls kvm_arch_on_sigbus_vcpu() directly > > * the error could be MCEERR_AR or MCEERR_AOFor the vcpu thread, the error can be MCEERR_AR or MCEERR_AO, > but kernel/KVM usually uses MCEERR_AR(action required) instead of MCEERR_AO, because it needs do action immediately. For MCEERR_AO error, the action is optional and the error can be ignored. > At least I do not find Linux kernel/KVM deliver MCEERR_AO in the vcpu threads. > > > (2) MCE errors on other threads: > > * here SIGBUS is blocked, so MCEERR_AR (action-required) > > errors will cause the kernel to just kill the QEMU process > > * MCEERR_AO errors will be handled via the iothread's use > > of signalfd(), so kvm_on_sigbus() will get called from > > the main thread, and it will call kvm_arch_on_sigbus_vcpu() > > * in this case the passed in CPUState will (arbitrarily) be that > > for the first vCPU > > For the MCE errors on other threads, it can only handle MCEERR_AO. If it is MCEERR_AR, the QEMU will assert and exit[2]. > > Case1: Other APP indeed can send MCEERR_AO to QEMU, QEMU handle it via the iothread's use of signalfd() through above path. > Case2: But if the MCEERR_AO is delivered by kernel, I see QEMU ignore it because SIGBUS is masked in main thread[3], for this case, I do not see QEMU handle it via signalfd() for MCEERR_AO errors from my test. SIGBUS is blocked in the main thread because we use signalfd(). The function sigfd_handler() should be called and it will then manually invoke the correct function for the signal. > For Case1,I think we should not let guest know it, because it is not triggered by guest. only other APP send SIGBUS to tell QEMU do somethings. I don't understand what you mean here by "other app" or "guest" triggering of MCEERR. I thought that an MCEERR meant "the hardware has detected that there is a problem with the RAM". If there's a problem with the RAM and it's the RAM that's being used as guest RAM, we need to tell the guest, surely ? > For Case2,it does not call call kvm_arch_on_sigbus_vcpu(). It should do. The code you quote calls that function for that case: > [1]: > /* Called synchronously (via signalfd) in main thread. */ > int kvm_on_sigbus(int code, void *addr) > { > #ifdef KVM_HAVE_MCE_INJECTION > /* Action required MCE kills the process if SIGBUS is blocked. Because > * that's what happens in the I/O thread, where we handle MCE via signalfd, > * we can only get action optional here. > */ > [2]: assert(code != BUS_MCEERR_AR); > kvm_arch_on_sigbus_vcpu(first_cpu, code, addr); > return 0; > #else > return 1; > #endif > } > Above all, from my test, for MCEERR_AO error which is triggered by guest, it not call kvm_arch_on_sigbus_vcpu(). I'm not sure what you mean by "triggered by guest". I assume that exactly what kind of errors the kernel can report and when will depend to some extent on the underlying hardware/firmware implementation of reporting of memory errors, but in principle the ABI allows the kernel to send SIGBUS_(BUS_MCEERR_AO) to the main thread, the signal should be handled by signalfd, our code for working with multiple fds should mean that the main thread calls sigfd_handler() to deal with reading bytes from the signalfd fd, and that function should then call sigbus_handler(), which calls kvm_on_sigbus(), which calls kvm_arch_on_sigbus_vcpu(). If something in that code path is not working then we need to find out what it is. thanks -- PMM