On Wed, 28 Apr 2021 18:50:26 -0400 Peter Xu <peterx@xxxxxxxxxx> wrote:

> We tried to do something similar in b569a1760782 ("userfaultfd: wp: drop
> _PAGE_UFFD_WP properly when fork") previously, but it's not doing it all
> right..  A few fixes around the code path:
>
>   1. We were referencing VM_UFFD_WP vm_flags on the _old_ vma rather than the
>      new vma.  That's overlooked in b569a1760782, so it won't work as expected.
>      Thanks to the recent rework on fork code (7a4830c380f3a8b3), we can easily
>      get the new vma now, so switch the checks to that.
>
>   2. Dropping the uffd-wp bit in copy_huge_pmd() could be wrong if the huge pmd
>      is a migration huge pmd.  When it happens, instead of using pmd_uffd_wp(),
>      we should use pmd_swp_uffd_wp().  The fix is simply to handle them separately.
>
>   3. Forget to carry over uffd-wp bit for a write migration huge pmd entry.
>      This also happens in copy_huge_pmd(), where we converted a write huge
>      migration entry into a read one.
>
>   4. In copy_nonpresent_pte(), drop uffd-wp if necessary for swap ptes.
>
>   5. In copy_present_page() when COW is enforced when fork(), we also need to
>      pass over the uffd-wp bit if VM_UFFD_WP is armed on the new vma, and when
>      the pte to be copied has uffd-wp bit set.
>
> Remove the comment in copy_present_pte() about this.  It won't help a huge lot
> to only comment there, but comment everywhere would be an overkill.  Let's
> assume the commit messages would help.
>

This runs afoul of Alistair's "mm: Device exclusive memory access",
https://lkml.kernel.org/r/20210524132725.12697-8-apopple@xxxxxxxxxx

`vma' is now undeclared.  I think this?

--- a/mm/memory.c~mm-userfaultfd-fix-uffd-wp-special-cases-for-fork-fix
+++ a/mm/memory.c
@@ -850,8 +850,8 @@ copy_nonpresent_pte(struct mm_struct *ds
 		 * exclusive entries currently only support private writable
 		 * (ie. COW) mappings.
 		 */
-		VM_BUG_ON(!is_cow_mapping(vma->vm_flags));
-		if (try_restore_exclusive_pte(src_mm, src_pte, vma, addr))
+		VM_BUG_ON(!is_cow_mapping(dst_vma->vm_flags));
+		if (try_restore_exclusive_pte(src_mm, src_pte, dst_vma, addr))
 			return -EBUSY;
 		return -ENOENT;
 	}
_
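
For anyone following along, item 1 of the quoted changelog (check VM_UFFD_WP on the new vma, not the old one) can be modelled in userspace roughly as below. This is only a toy sketch, not kernel code: the toy_* names and the flag values are made up for illustration and do not match the real vm_flags or pte bit layout.

```c
#include <assert.h>	/* only needed for the usage example below */

/* Hypothetical stand-ins, NOT the kernel's definitions or bit values. */
#define TOY_VM_UFFD_WP	0x1u	/* models VM_UFFD_WP in vma->vm_flags */
#define TOY_PTE_UFFD_WP	0x2u	/* models the uffd-wp bit in a pte */
#define TOY_PTE_WRITE	0x4u	/* models the pte write bit */

struct toy_vma {
	unsigned int vm_flags;
};

/*
 * Model of copying one pte value into the child at fork time.
 * The decision is made on the destination (new) vma: if uffd-wp is
 * not armed there, the uffd-wp bit is dropped from the copied value;
 * otherwise it is carried over unchanged.
 */
static unsigned int toy_copy_pte(unsigned int src_pte,
				 const struct toy_vma *dst_vma)
{
	unsigned int pte = src_pte;

	if (!(dst_vma->vm_flags & TOY_VM_UFFD_WP))
		pte &= ~TOY_PTE_UFFD_WP;

	return pte;
}
```

With a dst vma whose vm_flags is 0, toy_copy_pte(TOY_PTE_UFFD_WP | TOY_PTE_WRITE, &dst) returns TOY_PTE_WRITE; with TOY_VM_UFFD_WP set on the dst vma the bit is kept.  Checking the old vma instead would give the wrong answer whenever the two vmas differ in that flag, which is the case the changelog describes.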