On 03/24/23 10:26, Peter Xu wrote: > This patch fixes an issue that a hugetlb uffd-wr-protected mapping can be > writable even with uffd-wp bit set. It only happens with hugetlb private > mappings, when someone firstly wr-protects a missing pte (which will > install a pte marker), then a write to the same page without any prior > access to the page. > > Userfaultfd-wp trap for hugetlb was implemented in hugetlb_fault() before > reaching hugetlb_wp() to avoid taking more locks that userfault won't need. > However there's one CoW optimization path that can trigger hugetlb_wp() > inside hugetlb_no_page(), which will bypass the trap. > > This patch skips hugetlb_wp() for CoW and retries the fault if uffd-wp bit > is detected. The new path will only trigger in the CoW optimization path > because generic hugetlb_fault() (e.g. when a present pte was wr-protected) > will resolve the uffd-wp bit already. Also make sure anonymous UNSHARE > won't be affected and can still be resolved, IOW only skip CoW not CoR. > > This patch will be needed for v5.19+ hence copy stable. > > Reported-by: Muhammad Usama Anjum <usama.anjum@xxxxxxxxxxxxx> > Cc: linux-stable <stable@xxxxxxxxxxxxxxx> > Fixes: 166f3ecc0daf ("mm/hugetlb: hook page faults for uffd write protection") > Signed-off-by: Peter Xu <peterx@xxxxxxxxxx> > --- > > Notes: > > v2 is not on the list but in an attachment in the reply; this v3 is mostly > to make sure it's not the same as the patch used to be attached. Sorry > Andrew, we need to drop the queued one as I rewrote the commit message. My appologies! I saw the code path missed in v2 and assumed you did not think it applied. So, I said nothing. My bad! > Muhammad, I didn't attach your T-b because of the slight functional change. > Please feel free to re-attach if it still works for you (which I believe > should). > > thanks, > --- > mm/hugetlb.c | 14 ++++++++++++-- > 1 file changed, 12 insertions(+), 2 deletions(-) > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index 8bfd07f4c143..a58b3739ed4b 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -5478,7 +5478,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma, > struct folio *pagecache_folio, spinlock_t *ptl) > { > const bool unshare = flags & FAULT_FLAG_UNSHARE; > - pte_t pte; > + pte_t pte = huge_ptep_get(ptep); > struct hstate *h = hstate_vma(vma); > struct page *old_page; > struct folio *new_folio; > @@ -5487,6 +5487,17 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma, > unsigned long haddr = address & huge_page_mask(h); > struct mmu_notifier_range range; > > + /* > + * Never handle CoW for uffd-wp protected pages. It should be only > + * handled when the uffd-wp protection is removed. > + * > + * Note that only the CoW optimization path (in hugetlb_no_page()) > + * can trigger this, because hugetlb_fault() will always resolve > + * uffd-wp bit first. > + */ > + if (!unshare && huge_pte_uffd_wp(pte)) > + return 0; This looks correct. However, since the previous version looked correct I must ask. Can we have unshare set and huge_pte_uffd_wp true? If so, then it seems we would need to possibly propogate that uffd_wp to the new pte as in v2 > + > /* > * hugetlb does not support FOLL_FORCE-style write faults that keep the > * PTE mapped R/O such as maybe_mkwrite() would do. > @@ -5500,7 +5511,6 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma, > return 0; > } > > - pte = huge_ptep_get(ptep); > old_page = pte_page(pte); > > delayacct_wpcopy_start(); > -- > 2.39.1 > -- Mike Kravetz