On 09/07/2009 11:32 AM, Huang Ying wrote:
UCR (uncorrected recoverable) MCE is supported in recent Intel CPUs, where some hardware errors, such as certain memory errors, can be reported without PCC (processor context corrupted). To recover from such an MCE, the corresponding memory is unmapped and all processes accessing that memory are killed via SIGBUS.

For KVM, if QEMU/KVM is killed, all guest processes are killed too. So we relay the SIGBUS from the host OS to the guest system via a UCR MCE injection. The guest OS can then isolate the corresponding memory and kill only the necessary guest processes.

A SIGBUS sent to the main thread (not the VCPU threads) will be broadcast to all VCPU threads as a UCR MCE.
Won't the guest be confused by the broadcast? How does real hardware work?
+static void sigbus_handler(int n, struct signalfd_siginfo *siginfo, void *ctx)
+{
+    if (siginfo->ssi_code == BUS_MCEERR_AO) {
+        uint64_t status;
+        unsigned long paddr;
+        CPUState *cenv;
+
+        /* Hope we are lucky for AO MCE */
+        if (kvm_addr_userspace_to_phys((unsigned long)siginfo->ssi_addr,
+                                       &paddr)) {
+            fprintf(stderr, "Hardware memory error for memory used by "
+                    "QEMU itself instead of guest system!: %llx\n",
+                    (unsigned long long)siginfo->ssi_addr);
+            return;
+        }
+        status = MCI_STATUS_VAL | MCI_STATUS_UC | MCI_STATUS_EN
+            | MCI_STATUS_MISCV | MCI_STATUS_ADDRV | MCI_STATUS_S
+            | 0xc0;
+        kvm_inject_x86_mce(first_cpu, 9, status,
+                           MCG_STATUS_MCIP | MCG_STATUS_RIPV, paddr,
+                           (MCM_ADDR_PHYS << 6) | 0xc);
This is a vcpu ioctl, yes? If so, it must be called from the vcpu thread.
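For readers following the quoted handler, here is a minimal sketch of what the values passed to kvm_inject_x86_mce() encode. It is not part of the patch. The bit positions are copied from the Linux kernel's asm/mce.h as I understand them, and the readings of 0xc0 as a memory-scrubbing MCACOD and of 0xc as a 4 KiB address granularity are my interpretation of the Intel SDM, not something the patch states. The helper names (ao_mci_status, ao_mci_misc, ao_mcg_status, is_action_optional, check_ao_encoding) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* MCi_STATUS bits; positions assumed to match Linux asm/mce.h. */
#define MCI_STATUS_VAL   (1ULL << 63)  /* register contents valid */
#define MCI_STATUS_UC    (1ULL << 61)  /* uncorrected error */
#define MCI_STATUS_EN    (1ULL << 60)  /* error reporting enabled */
#define MCI_STATUS_MISCV (1ULL << 59)  /* MCi_MISC valid */
#define MCI_STATUS_ADDRV (1ULL << 58)  /* MCi_ADDR valid */
#define MCI_STATUS_PCC   (1ULL << 57)  /* processor context corrupt */
#define MCI_STATUS_S     (1ULL << 56)  /* signaled as machine check */
#define MCI_STATUS_AR    (1ULL << 55)  /* action required */

/* MCG_STATUS bits. */
#define MCG_STATUS_RIPV  (1ULL << 0)   /* restart IP valid */
#define MCG_STATUS_MCIP  (1ULL << 2)   /* machine check in progress */

/* MCi_MISC address mode (bits 8:6); 2 means physical address. */
#define MCM_ADDR_PHYS    2

/* Status word the quoted handler builds for BUS_MCEERR_AO. */
static inline uint64_t ao_mci_status(void)
{
    return MCI_STATUS_VAL | MCI_STATUS_UC | MCI_STATUS_EN
         | MCI_STATUS_MISCV | MCI_STATUS_ADDRV | MCI_STATUS_S
         | 0xc0;  /* MCACOD; assumed to mean a memory scrubbing error */
}

/* MISC word: physical address mode, address LSB 12 (assumed 4 KiB). */
static inline uint64_t ao_mci_misc(void)
{
    return ((uint64_t)MCM_ADDR_PHYS << 6) | 0xc;
}

/* Global status the handler passes alongside the bank status. */
static inline uint64_t ao_mcg_status(void)
{
    return MCG_STATUS_MCIP | MCG_STATUS_RIPV;
}

/* Action-optional UCR: VAL, UC and S set; PCC and AR clear. */
static inline int is_action_optional(uint64_t status)
{
    const uint64_t must_set = MCI_STATUS_VAL | MCI_STATUS_UC | MCI_STATUS_S;
    const uint64_t must_clear = MCI_STATUS_PCC | MCI_STATUS_AR;

    return (status & must_set) == must_set && (status & must_clear) == 0;
}

/* Sanity checks on the encoding above. */
static inline void check_ao_encoding(void)
{
    assert(is_action_optional(ao_mci_status()));
    assert((ao_mci_status() & 0xffff) == 0xc0);
    assert((ao_mci_misc() >> 6) == MCM_ADDR_PHYS);
}
```

Calling check_ao_encoding() from a small test program is enough to exercise it. The point of the sketch is that the injected event has PCC and AR clear, so the guest may treat it as action-optional and defer handling. That is separate from the question of which thread is allowed to issue the injection.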