Hi James,

On Thu, Dec 12, 2019 at 03:34:31PM +0000, James Morse wrote:
> Hi Marc,
>
> On 12/12/2019 11:33, Marc Zyngier wrote:
> > On 2019-12-11 16:56, Marc Zyngier wrote:

[...]

>
> (allocating from a kmemcache while holding current's mmap_sem. I don't want to think about
> it!)
>
> Can we be lazier? We want the VMA to get the size of the poisoned mapping correct in the
> signal. The bug is that this could change when we drop the lock, before queuing the
> signal, so we report hwpoison on old-vmas:pfn with new-vmas:size.
>
> Can't it equally change when we drop the lock after queuing the signal? Any time before
> the thread returns to user-space to take the signal gives us a stale value.
>
> I think all that matters is the size goes with the pfn that was poisoned. If we look the
> vma up by hva again, we have to check if the pfn has changed too... (which you are doing)
>
> Can we stash the size in the existing mmap_sem region, and use that in
> kvm_send_hwpoison_signal()? We know it matches the pfn we saw as poisoned.
>
> The vma could be changed before/after we send the signal, but user-space can't know which.
> This is user-spaces' problem for messing with the memslots while a vpcu is running.
>

(I should clearly have expanded this thread before I replied to the
original patch...)

>
> How about (untested):
> -------------------------%<-------------------------
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 38b4c910b6c3..80212d4935bd 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1591,16 +1591,8 @@ static void invalidate_icache_guest_page(kvm_pfn_t pfn, unsigned
> long size)
>  	__invalidate_icache_guest_page(pfn, size);
>  }
>
> -static void kvm_send_hwpoison_signal(unsigned long address,
> -				     struct vm_area_struct *vma)
> +static void kvm_send_hwpoison_signal(unsigned long address, short lsb)
>  {
> -	short lsb;
> -
> -	if (is_vm_hugetlb_page(vma))
> -		lsb = huge_page_shift(hstate_vma(vma));
> -	else
> -		lsb = PAGE_SHIFT;
> -
>  	send_sig_mceerr(BUS_MCEERR_AR, (void __user *)address, lsb, current);
>  }
>
> @@ -1673,6 +1665,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	struct kvm *kvm = vcpu->kvm;
>  	struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache;
>  	struct vm_area_struct *vma;
> +	short stage1_vma_size;
>  	kvm_pfn_t pfn;
>  	pgprot_t mem_type = PAGE_S2;
>  	bool logging_active = memslot_is_logging(memslot);
>
> @@ -1703,6 +1696,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  		vma_pagesize = PAGE_SIZE;
>  	}
>
> +	/* For signals due to hwpoison, we need to use the stage1 size */
> +	if (is_vm_hugetlb_page(vma))
> +		stage1_vma_size = huge_page_shift(hstate_vma(vma));
> +	else
> +		stage1_vma_size = PAGE_SHIFT;
> +

But (see my patch) as far as I can tell, this is already what we have in
vma_pagesize, and do we really have to provide the stage 1 size to user
space if the fault happened within a smaller boundary? Isn't that just
providing more precise information to the user?

Thanks,
Christoffer