On Wed, Dec 11, 2019 at 04:56:49PM +0000, Marc Zyngier wrote: > When we check for a poisoned page, we use the VMA to tell userspace > about the looming disaster. But we pass a pointer to this VMA > after having released the mmap_sem, which isn't a good idea. > > Instead, re-check that we have still have a VMA, and that this > VMA still points to a poisoned page. If the VMA isn't there, > userspace is playing with our nerves, so lety's give it a -EFAULT > (it deserves it). If the PFN isn't poisoned anymore, let's restart > from the top and handle the fault again. > > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx> > --- > virt/kvm/arm/mmu.c | 25 +++++++++++++++++++++++-- > 1 file changed, 23 insertions(+), 2 deletions(-) > > diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c > index 0b32a904a1bb..f73393f5ddb7 100644 > --- a/virt/kvm/arm/mmu.c > +++ b/virt/kvm/arm/mmu.c > @@ -1741,9 +1741,30 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, > > pfn = gfn_to_pfn_prot(kvm, gfn, write_fault, &writable); > if (pfn == KVM_PFN_ERR_HWPOISON) { > - kvm_send_hwpoison_signal(hva, vma); > - return 0; > + /* > + * Search for the VMA again, as it may have been > + * removed in the interval... > + */ > + down_read(¤t->mm->mmap_sem); > + vma = find_vma_intersection(current->mm, hva, hva + 1); > + if (vma) { > + /* > + * Recheck for a poisoned page. If something changed > + * behind our back, don't do a thing and take the > + * fault again. > + */ > + pfn = gfn_to_pfn_prot(kvm, gfn, write_fault, &writable); > + if (pfn == KVM_PFN_ERR_HWPOISON) > + kvm_send_hwpoison_signal(hva, vma); > + > + ret = 0; > + } else { > + ret = -EFAULT; > + } > + up_read(¤t->mm->mmap_sem); > + return ret; > } > + > if (is_error_noslot_pfn(pfn)) > return -EFAULT; > > -- > 2.20.1 > If I read this code correctly, then all we use the VMA for is to find the page size used by the MMU to back the VMA, which we've already established in the vma_pagesize (and possibly adjusted to something more accurate based on our constraints in stage 2 which generated the error), so all we need is the size and a way to convert that into a shift. Not being 100% confident about the semantics of the lsb bit we pass to user space (is it indicating the size of the mapping which caused the error or the size of the mapping where user space could potentially trigger an error?), or wheter we care enough at that level, could we consider something like the following instead? diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c index 38b4c910b6c3..2509d9dec42d 100644 --- a/virt/kvm/arm/mmu.c +++ b/virt/kvm/arm/mmu.c @@ -1592,15 +1592,9 @@ static void invalidate_icache_guest_page(kvm_pfn_t pfn, unsigned long size) } static void kvm_send_hwpoison_signal(unsigned long address, - struct vm_area_struct *vma) + unsigned long vma_pagesize) { - short lsb; - - if (is_vm_hugetlb_page(vma)) - lsb = huge_page_shift(hstate_vma(vma)); - else - lsb = PAGE_SHIFT; - + short lsb = __ffs(vma_pagesize); send_sig_mceerr(BUS_MCEERR_AR, (void __user *)address, lsb, current); } @@ -1735,7 +1729,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa, pfn = gfn_to_pfn_prot(kvm, gfn, write_fault, &writable); if (pfn == KVM_PFN_ERR_HWPOISON) { - kvm_send_hwpoison_signal(hva, vma); + kvm_send_hwpoison_signal(hva, vma_pagesize); return 0; } if (is_error_noslot_pfn(pfn)) Thanks, Christoffer