Re: [x86 RAS question] What happens when a guest takes a machine check exception (MCE) on a page table page?

Resending with a correction.


> 
> The host will deliver SIGBUS even in this case; the host doesn't know what the page is used for.

Paolo,
   Thanks for the reply.

I am afraid the SIGBUS will not be delivered in this case.

The host has two chances to deliver SIGBUS: one in the memory error handler (memory_failure()), the other in KVM.

When hardware detects a memory error on a stage1 page table page (for example, during background memory scrubbing), an SRAO MCE is raised, the guest traps to the host, and the host kernel invokes the memory error handler.
But the memory error handler (memory_failure()) does nothing beyond setting the poisoned flag on that page table page, because there is currently no way to isolate a page table page. The main problem seems to be that nobody can easily tell which processes own the page table page.
So the faulty page remains accessible, and no SIGBUS is delivered at this point. [1]

Later, when some CPU tries to access the stage1 page table page, the more severe SRAR MCE is raised and the guest traps to the host again. KVM then checks whether the faulting physical address has the poisoned flag set and, if so, delivers a BUS_MCEERR_AR SIGBUS. For this page table RAS error, however, I am afraid the address KVM checks is the target address of the translation [2], not the address of the page table itself. So the SIGBUS may not be delivered.

For example:
When the guest translates GVA "A" to HPA "B", the walk goes through the steps GVA->IPA->HPA.
If the translation faults, for instance because of a page table error, KVM ends up with the HPA "B" and checks that address,
while the address that actually has the error is the page table address, not "B".
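
The mismatch in this example can be modelled in a few lines of C. This is only a toy sketch of the concern, not kernel or KVM code: every name in it (toy_poisoned, struct toy_walk, toy_check_target, toy_check_walk, toy_scenario) is hypothetical, and the page numbers are arbitrary. It marks the table page as poisoned and compares a check that looks only at the translation result with one that also looks at the table page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical and are not kernel APIs. */
#define TOY_NPAGES 8

static bool toy_poisoned[TOY_NPAGES];  /* stands in for the per-page poison flag */

struct toy_walk {
	size_t table_pfn;   /* page that holds the translation entry */
	size_t target_pfn;  /* page the translation resolves to ("B") */
};

/* Looks only at the translation result, which is the behavior feared above. */
static bool toy_check_target(const struct toy_walk *w)
{
	return toy_poisoned[w->target_pfn];
}

/* Looks at every page the walk touched, including the table page. */
static bool toy_check_walk(const struct toy_walk *w)
{
	return toy_poisoned[w->table_pfn] || toy_poisoned[w->target_pfn];
}

/* Scenario from the mail: the error is in the table page, not in "B". */
static void toy_scenario(void)
{
	struct toy_walk w = { .table_pfn = 3, .target_pfn = 5 };

	toy_poisoned[w.table_pfn] = true;  /* what the SRAO handling records */
	assert(!toy_check_target(&w));     /* poison missed, so no SIGBUS */
	assert(toy_check_walk(&w));        /* poison seen */
}
```

Calling toy_scenario() from a main() exercises both checks. In this model, only the check that covers the table page notices the poison.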


[1] https://lkml.org/lkml/2017/10/30/773
[2] When a stage2 page table RAS error happens, the pfn here should not be the page table address:

static int kvm_handle_bad_page(struct kvm_vcpu *vcpu, gfn_t gfn, kvm_pfn_t pfn)
{
	.........
	if (pfn == KVM_PFN_ERR_HWPOISON) {
		/* The signal address is derived from gfn, not from a page table page. */
		kvm_send_hwpoison_signal(kvm_vcpu_gfn_to_hva(vcpu, gfn), current);
		return 0;
	}

	return -EFAULT;
}
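
For reference, here is a minimal user-space sketch of how a process such as a VMM could tell a memory-error SIGBUS apart from an ordinary one when the signal does arrive. It uses only the standard sigaction() interface and the Linux si_code values BUS_MCEERR_AR and BUS_MCEERR_AO. The names classify_sigbus, on_sigbus, install_sigbus_handler, sigbus_selfcheck and the enum are made up for illustration, and this is not how QEMU implements it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Linux si_code values, defined here only if the libc headers lack them. */
#ifndef BUS_MCEERR_AR
#define BUS_MCEERR_AR 4
#endif
#ifndef BUS_MCEERR_AO
#define BUS_MCEERR_AO 5
#endif

/* Hypothetical classification used only by this sketch. */
enum mce_kind { MCE_NONE, MCE_ACTION_REQUIRED, MCE_ACTION_OPTIONAL };

static enum mce_kind classify_sigbus(int si_code)
{
	if (si_code == BUS_MCEERR_AR)
		return MCE_ACTION_REQUIRED;  /* poisoned data was consumed */
	if (si_code == BUS_MCEERR_AO)
		return MCE_ACTION_OPTIONAL;  /* poison found, not yet consumed */
	return MCE_NONE;                     /* some other SIGBUS */
}

static volatile sig_atomic_t last_kind;
static void *volatile last_addr;

/* Records the kind of SIGBUS and the reported address (si_addr). */
static void on_sigbus(int sig, siginfo_t *info, void *ctx)
{
	(void)sig;
	(void)ctx;
	last_kind = classify_sigbus(info->si_code);
	last_addr = info->si_addr;
}

static int install_sigbus_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_sigbus;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGBUS, &sa, NULL);
}

/* A SIGBUS raised by the process itself must not look like an MCE. */
static void sigbus_selfcheck(void)
{
	int rc = install_sigbus_handler();

	assert(rc == 0);
	(void)rc;
	last_kind = -1;
	raise(SIGBUS);
	assert(last_kind == MCE_NONE);
}
```

With such a handler, si_addr is the only information about where the error is, so it matters which address the kernel side passes to the signal.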

> 
> If the MCE happens on the *host* page tables, I think QEMU will be killed.

Yes, if the MCE happens in QEMU itself instead of in the guest, QEMU will be killed.

> 
> Paolo
> 
> > then guest will not know it happen error, for this case, what is the 
> > behavior for KVM, terminate the VM? I do not find the handling logic.
> > Thanks! look forward your to reply.
> >
> >




