Re: [PATCH] mm/hugetlb: avoid get wrong ptep caused by race

Sean Christopherson <sean.j.christopherson@xxxxxxxxx> · Wed, 19 Feb 2020 08:22:31 -0800

On Wed, Feb 19, 2020 at 08:21:26PM +0800, Longpeng (Mike) wrote:
> 在 2020/2/19 9:58, Sean Christopherson 写道:
> > FWIW, I'd be in favor of going the READ/WRITE_ONCE() route for x86, e.g.
> > convert everything as a follow-up patch (or patches).  I'm fairly confident
> > that KVM's usage of lookup_address_in_mm() is safe, but I wouldn't exactly
> > bet my life on it.  I'd much rather the failing scenario be that KVM uses
> > a sub-optimal page size as opposed to exploding on a bad pointer.
> > 
> Um...our testcase starts 50 VMs with 2U4G(use 1G hugepage) and then do
> live-upgrade(private feature that just modify the qemu and libvirt) and
> live-migrate in turns for each one. However our live upgraded new QEMU won't do
> touch_all_pages.
> Suppose we start a VM without touch_all_pages in QEMU, the VM's guest memory is
> not mapped in the CR3 pagetable at the moment. When the 2 vcpus running, they
> could access some pages belong to the same 1G-hugepage, both of them will vmexit
> due to ept_violation and then call gup-->follow_hugetlb_page-->hugetlb_fault, so
> the race may encounter, right?

Yep.  The code I'm referring to is similar but different code that just
happened to go into KVM for kernel 5.6.  It has no effect on the gup() flow
that leads to this bug.  I mentioned it above as an example of code outside
of hugetlb_fault() that would also benefit from moving to READ/WRITE_ONCE().