On Tue, Apr 27, 2021 at 12:12:58PM -0400, Peter Xu wrote:
> File-backed memories are prone to unmap/swap so the ptes are always unstable.
> This could lead to userfaultfd-wp information getting lost when unmapped or
> swapped out on such types of memory, for example, shmem.  To keep such
> information persistent, we will start to use the newly introduced swap-like
> special ptes to replace a null pte when those ptes were removed.
>
> Prepare this by handling such a special pte first before it is applied.  Here
> a new fault flag FAULT_FLAG_UFFD_WP is introduced.  When this flag is set, it

FAULT_FLAG_UFFD_WP does not exist any more.  Obviously I should have touched up
the commit message when touching up the code...

> means the current fault is to resolve a page access (either read or write) to
> the uffd-wp special pte.
>
> The handling of this special pte page fault is similar to missing fault, but it
> should happen after the pte missing logic since the special pte is designed to
> be a swap-like pte.  Meanwhile it should be handled before do_swap_page() so
> that the swap core logic won't be confused to see such an illegal swap pte.
>
> This is a slow path of uffd-wp handling, because unmap of wr-protected shmem
> ptes should be rare.  So far it should only trigger in two conditions:
>
>   (1) When trying to punch holes in shmem_fallocate(), there will be a
>       pre-unmap optimization before evicting the page.  That will create
>       unmapped shmem ptes with wr-protected pages covered.
>
>   (2) Swapping out of shmem pages
>
> Because of this, the page fault handling is simplified too by not sending the
> wr-protect message in the 1st page fault, instead the page will be installed
> read-only, so the message will not be generated until the next do_wp_page()
> call.
>
> Disable fault-around for such a special page fault, because the introduced new
> flag (FAULT_FLAG_UFFD_WP) only applies to current pte rather than all the pages

Same here.

> around it.
> Doing fault-around with the new flag could confuse all the rest of
> pages when installing ptes from page cache when there's a cache hit.
>
> Signed-off-by: Peter Xu <peterx@xxxxxxxxxx>
> ---
>  include/linux/userfaultfd_k.h | 11 +++++
>  mm/memory.c                   | 80 ++++++++++++++++++++++++++++++++---
>  2 files changed, 86 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
> index bc733512c6905..fefebe6e96560 100644
> --- a/include/linux/userfaultfd_k.h
> +++ b/include/linux/userfaultfd_k.h
> @@ -89,6 +89,17 @@ static inline bool uffd_disable_huge_pmd_share(struct vm_area_struct *vma)
>  	return vma->vm_flags & (VM_UFFD_WP | VM_UFFD_MINOR);
>  }
>
> +/*
> + * Don't do fault around for FAULT_FLAG_UFFD_WP because it means we want to

Same here...

> + * recover a previously wr-protected pte.  This flag is a per-pte information,
> + * so it could confuse all the pages around the current page when faulted in.
> + * Similar reason for MINOR mode faults.
> + */
> +static inline bool uffd_disable_fault_around(struct vm_area_struct *vma)
> +{
> +	return vma->vm_flags & (VM_UFFD_WP | VM_UFFD_MINOR);
> +}

--
Peter Xu
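
Editorial illustration, not part of the original message: the handling order
and the fault-around rule described in the quoted commit message can be
modelled in plain userspace C.  This is only a sketch under stated
assumptions.  Every model_* and MODEL_* name below is hypothetical, the flag
values are made up, and none of it is the mm/memory.c change itself, which is
not quoted in this reply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for this userspace model; not the kernel's values. */
#define MODEL_VM_UFFD_WP    (1UL << 0)
#define MODEL_VM_UFFD_MINOR (1UL << 1)

/* Hypothetical stand-in for the one vma field the quoted helper reads. */
struct model_vma {
	unsigned long vm_flags;
};

/* Hypothetical pte states: empty, mapped, uffd-wp special marker, real swap. */
enum model_pte {
	MODEL_PTE_NONE,
	MODEL_PTE_PRESENT,
	MODEL_PTE_UFFD_WP_SPECIAL,
	MODEL_PTE_SWAP,
};

/* Hypothetical names for the fault path that ends up handling each state. */
enum model_path {
	MODEL_PATH_MISSING,
	MODEL_PATH_PRESENT,
	MODEL_PATH_UFFD_WP_SPECIAL,
	MODEL_PATH_SWAP,
};

/* Same predicate as the quoted helper: skip fault-around on WP/MINOR vmas. */
static bool model_disable_fault_around(const struct model_vma *vma)
{
	assert(vma != NULL);
	return (vma->vm_flags & (MODEL_VM_UFFD_WP | MODEL_VM_UFFD_MINOR)) != 0;
}

/*
 * Ordering from the commit message: the missing-pte check comes first, the
 * special pte is recognised next, and only then does the swap path run, so
 * the swap code never sees the special marker.
 */
static enum model_path model_dispatch(enum model_pte pte)
{
	if (pte == MODEL_PTE_NONE)
		return MODEL_PATH_MISSING;
	if (pte == MODEL_PTE_PRESENT)
		return MODEL_PATH_PRESENT;
	if (pte == MODEL_PTE_UFFD_WP_SPECIAL)
		return MODEL_PATH_UFFD_WP_SPECIAL;
	return MODEL_PATH_SWAP;
}
```

In this model a vma with neither flag set keeps fault-around enabled, and a
special pte is always routed away from the swap path, which is the property
the commit message asks for.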