On Fri, 2022-10-14 at 17:52 +0200, Peter Zijlstra wrote:
> On Thu, Sep 29, 2022 at 03:29:14PM -0700, Rick Edgecombe wrote:
> > diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> > index 7327b2573f7c..b49372c7de41 100644
> > --- a/mm/userfaultfd.c
> > +++ b/mm/userfaultfd.c
> > @@ -63,6 +63,7 @@ int mfill_atomic_install_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
> >  	int ret;
> >  	pte_t _dst_pte, *dst_pte;
> >  	bool writable = dst_vma->vm_flags & VM_WRITE;
> > +	bool shstk = dst_vma->vm_flags & VM_SHADOW_STACK;
> >  	bool vm_shared = dst_vma->vm_flags & VM_SHARED;
> >  	bool page_in_cache = page->mapping;
> >  	spinlock_t *ptl;
> > @@ -83,9 +84,12 @@ int mfill_atomic_install_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
> >  		writable = false;
> >  	}
> >
> > -	if (writable)
> > -		_dst_pte = pte_mkwrite(_dst_pte);
> > -	else
> > +	if (writable) {
> > +		if (shstk)
> > +			_dst_pte = pte_mkwrite_shstk(_dst_pte);
> > +		else
> > +			_dst_pte = pte_mkwrite(_dst_pte);
> > +	} else
> >  		/*
> >  		 * We need this to make sure write bit removed; as mk_pte()
> >  		 * could return a pte with write bit set.
>
> Urgh.. that's unfortunate. But yeah, I don't see a way to make that
> pretty either.

Nadav pointed out that:
	entry = maybe_mkwrite(pte_mkdirty(entry), vma);
and:
	if (vma->vm_flags & VM_WRITE)
		entry = pte_mkwrite(pte_mkdirty(entry));

Are not actually the same, because in the former the non-writable PTE
gets marked dirty. So I was actually going to add two more cases like
the ugly case.
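
To make the difference between the two forms concrete, here is a minimal
user-space sketch. Everything prefixed toy_ or TOY_ is a hypothetical
stand-in that models a PTE as two flag bits; none of it is the kernel's
pte_t or its helpers. Only the shape of toy_maybe_mkwrite() is meant to
follow maybe_mkwrite(), which sets the write bit only when the VMA has
VM_WRITE.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins, NOT the kernel definitions: a PTE is just two flag bits. */
#define TOY_PTE_WRITE	0x1u
#define TOY_PTE_DIRTY	0x2u
#define TOY_VM_WRITE	0x1u

typedef struct { unsigned int val; } toy_pte_t;
struct toy_vma { unsigned int vm_flags; };

static inline toy_pte_t toy_pte_mkdirty(toy_pte_t pte)
{
	pte.val |= TOY_PTE_DIRTY;
	return pte;
}

static inline toy_pte_t toy_pte_mkwrite(toy_pte_t pte)
{
	pte.val |= TOY_PTE_WRITE;
	return pte;
}

static inline bool toy_pte_dirty(toy_pte_t pte)
{
	return (pte.val & TOY_PTE_DIRTY) != 0;
}

static inline bool toy_pte_write(toy_pte_t pte)
{
	return (pte.val & TOY_PTE_WRITE) != 0;
}

/* Same shape as maybe_mkwrite(): set write only if the VMA allows it. */
static inline toy_pte_t toy_maybe_mkwrite(toy_pte_t pte,
					  const struct toy_vma *vma)
{
	if (vma->vm_flags & TOY_VM_WRITE)
		pte = toy_pte_mkwrite(pte);
	return pte;
}

/* Form 1: dirty is applied unconditionally, before the VM_WRITE check. */
static inline toy_pte_t form_maybe(toy_pte_t entry, const struct toy_vma *vma)
{
	return toy_maybe_mkwrite(toy_pte_mkdirty(entry), vma);
}

/* Form 2: dirty and write are both applied only under VM_WRITE. */
static inline toy_pte_t form_cond(toy_pte_t entry, const struct toy_vma *vma)
{
	if (vma->vm_flags & TOY_VM_WRITE)
		entry = toy_pte_mkwrite(toy_pte_mkdirty(entry));
	return entry;
}

/* Compare both forms on a read-only and on a writable toy VMA. */
static inline void toy_check(void)
{
	const struct toy_vma ro = { .vm_flags = 0 };
	const struct toy_vma rw = { .vm_flags = TOY_VM_WRITE };
	const toy_pte_t clean = { .val = 0 };

	/* Read-only VMA: form 1 leaves a dirty, non-writable PTE. */
	assert(toy_pte_dirty(form_maybe(clean, &ro)));
	assert(!toy_pte_write(form_maybe(clean, &ro)));

	/* Read-only VMA: form 2 leaves the PTE untouched. */
	assert(form_cond(clean, &ro).val == 0);

	/* Writable VMA: the two forms agree. */
	assert(form_maybe(clean, &rw).val == form_cond(clean, &rw).val);
}
```

Calling toy_check() passes all four assertions: the two forms agree on a
writable VMA and differ only on a non-writable one, where the first form
still sets the dirty bit.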