On 03/21/23 15:18, Peter Xu wrote: Thanks Peter! > This patch fixes an issue that a hugetlb uffd-wr-protected mapping can be > writable even with uffd-wp bit set. It only happens with all these > conditions met: (1) hugetlb memory (2) private mapping (3) original mapping > was missing, then (4) being wr-protected (IOW, pte marker installed). Then ^^^^ Nit, but is the word "then" intended to be there? Almost makes it sound as if wr-protected was a result of the previous 3 conditions being met. > write to the page to trigger. > > Userfaultfd-wp trap for hugetlb was implemented in hugetlb_fault() before > even reaching hugetlb_wp() to avoid taking more locks that userfault won't > need. However there's one CoW optimization path for missing hugetlb page > that can trigger hugetlb_wp() inside hugetlb_no_page(), that can bypass the > userfaultfd-wp traps. > > A few ways to resolve this: > > (1) Skip the CoW optimization for hugetlb private mapping, considering > that private mappings for hugetlb should be very rare, so it may not > really be helpful to major workloads. The worst case is we only skip the > optimization if userfaultfd_wp(vma)==true, because uffd-wp needs another > fault anyway. > > (2) Move the userfaultfd-wp handling for hugetlb from hugetlb_fault() > into hugetlb_wp(). The major cons is there're a bunch of locks taken > when calling hugetlb_wp(), and that will make the changeset unnecessarily > complicated due to the lock operations. > > (3) Carry over uffd-wp bit in hugetlb_wp(), so it'll need to fault again > for uffd-wp privately mapped pages. > > This patch chose option (3) which contains the minimum changeset (simplest > for backport) and also make sure hugetlb_wp() itself will start to be > always safe with uffd-wp ptes even if called elsewhere in the future. I was going to suggest (1) as that would be the simplest. But, since (3) makes hugetlb_wp() safe for future callers, that is actually preferred. > > This patch will be needed for v5.19+ hence copy stable. > > Reported-by: Muhammad Usama Anjum <usama.anjum@xxxxxxxxxxxxx> > Cc: linux-stable <stable@xxxxxxxxxxxxxxx> > Fixes: 166f3ecc0daf ("mm/hugetlb: hook page faults for uffd write protection") > Signed-off-by: Peter Xu <peterx@xxxxxxxxxx> > --- > mm/hugetlb.c | 8 +++++--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index 8bfd07f4c143..22337b191eae 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -5478,7 +5478,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma, > struct folio *pagecache_folio, spinlock_t *ptl) > { > const bool unshare = flags & FAULT_FLAG_UNSHARE; > - pte_t pte; > + pte_t pte, newpte; > struct hstate *h = hstate_vma(vma); > struct page *old_page; > struct folio *new_folio; > @@ -5622,8 +5622,10 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma, > mmu_notifier_invalidate_range(mm, range.start, range.end); > page_remove_rmap(old_page, vma, true); > hugepage_add_new_anon_rmap(new_folio, vma, haddr); > - set_huge_pte_at(mm, haddr, ptep, > - make_huge_pte(vma, &new_folio->page, !unshare)); > + newpte = make_huge_pte(vma, &new_folio->page, !unshare); > + if (huge_pte_uffd_wp(pte)) > + newpte = huge_pte_mkuffd_wp(newpte); > + set_huge_pte_at(mm, haddr, ptep, newpte); > folio_set_hugetlb_migratable(new_folio); > /* Make the old page be freed below */ > new_folio = page_folio(old_page); > -- > 2.39.1 > Reviewed-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx> -- Mike Kravetz