On 09/30/22 23:01, Peter Xu wrote: > On Fri, Sep 30, 2022 at 10:47:45PM -0400, Peter Xu wrote: > From fe9e50551f3fdb7107315784affca4f9b1c4720f Mon Sep 17 00:00:00 2001 > From: Peter Xu <peterx@xxxxxxxxxx> > Date: Fri, 30 Sep 2022 22:22:44 -0400 > Subject: [PATCH] mm/hugetlb: Fix race condition of uffd missing handling > Content-type: text/plain > > Signed-off-by: Peter Xu <peterx@xxxxxxxxxx> > --- > mm/hugetlb.c | 36 +++++++++++++++++++++++++++++++++--- > 1 file changed, 33 insertions(+), 3 deletions(-) > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index dd29cba46e9e..5015d8aa5da4 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -5557,9 +5557,39 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, > if (!page) { > /* Check for page in userfault range */ > if (userfaultfd_missing(vma)) { > - ret = hugetlb_handle_userfault(vma, mapping, idx, > - flags, haddr, address, > - VM_UFFD_MISSING); > + bool same; > + > + /* > + * Since hugetlb_no_page() was examining pte > + * without pgtable lock, we need to re-test under > + * lock because the pte may not be stable and could > + * have changed from under us. Try to detect > + * either changed or during-changing ptes and bail > + * out properly. > + * > + * One example of changing pte is in-progress CoW > + * of private mapping, which will clear+flush pte > + * then reinstall the new one. > + * > + * Note that userfaultfd is actually fine with > + * false positives (e.g. caused by pte changed), > + * but not wrong logical events (e.g. caused by > + * reading a pte during changing). The latter can > + * confuse the userspace, so the strictness is very > + * much preferred. E.g., MISSING event should > + * never happen on the page after UFFDIO_COPY has > + * correctly installed the page and returned. > + */ Thanks Peter! The wording and pte_same check here is better than what I proposed. I think that last paragraph above should go into the commit message as it describes user visible effects (missing event after UFFDIO_COPY has correctly installed the page and returned). This seems to have existed since hugetlb userfault support was added. It just became exposed recently due to locking changes going into 6.1. However, I think it may have existed in the window after hugetlb userfault support was added and before current i_mmap_sema locking for pmd sharing was added. Just a long way of saying I am not sure cc stable if of much value. -- Mike Kravetz > + ptl = huge_pte_lock(h, mm, ptep); > + same = pte_same(huge_ptep_get(ptep), old_pte); > + spin_unlock(ptl); > + if (same) > + ret = hugetlb_handle_userfault(vma, mapping, idx, > + flags, haddr, address, > + VM_UFFD_MISSING); > + else > + /* PTE changed or unstable, retry */ > + ret = 0; > goto out; > } > > -- > 2.37.3 >