* Suren Baghdasaryan <surenb@xxxxxxxxxx> [240130 22:03]:
> On Mon, Jan 29, 2024 at 6:58 PM Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> wrote:

...

> > > > > > @@ -730,7 +759,7 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
> > > > > >
> > > > > >  out_unlock:
> > > > > >  	up_read(&ctx->map_changing_lock);
> > > > > > -	mmap_read_unlock(dst_mm);
> > > > > > +	unpin_vma(dst_mm, dst_vma, &mmap_locked);
> > > > > >  out:
> > > > > >  	if (folio)
> > > > > >  		folio_put(folio);
> > > > > > @@ -1285,8 +1314,6 @@ static int validate_move_areas(struct userfaultfd_ctx *ctx,
> > > > > >   * @len: length of the virtual memory range
> > > > > >   * @mode: flags from uffdio_move.mode
> > > > > >   *
> > > > > > - * Must be called with mmap_lock held for read.
> > > > > > - *
> > > > > >   * move_pages() remaps arbitrary anonymous pages atomically in zero
> > > > > >   * copy. It only works on non shared anonymous pages because those can
> > > > > >   * be relocated without generating non linear anon_vmas in the rmap
> > > > > > @@ -1353,15 +1380,16 @@ static int validate_move_areas(struct userfaultfd_ctx *ctx,
> > > > > >   * could be obtained. This is the only additional complexity added to
> > > > > >   * the rmap code to provide this anonymous page remapping functionality.
> > > > > >   */
> > > > > > -ssize_t move_pages(struct userfaultfd_ctx *ctx, struct mm_struct *mm,
> > > > > > -		   unsigned long dst_start, unsigned long src_start,
> > > > > > -		   unsigned long len, __u64 mode)
> > > > > > +ssize_t move_pages(struct userfaultfd_ctx *ctx, unsigned long dst_start,
> > > > > > +		   unsigned long src_start, unsigned long len, __u64 mode)
> > > > > >  {
> > > > > > +	struct mm_struct *mm = ctx->mm;
> > > > > >  	struct vm_area_struct *src_vma, *dst_vma;
> > > > > >  	unsigned long src_addr, dst_addr;
> > > > > >  	pmd_t *src_pmd, *dst_pmd;
> > > > > >  	long err = -EINVAL;
> > > > > >  	ssize_t moved = 0;
> > > > > > +	bool mmap_locked = false;
> > > > > >
> > > > > >  	/* Sanitize the command parameters. */
> > > > > >  	if (WARN_ON_ONCE(src_start & ~PAGE_MASK) ||
> > > > > > @@ -1374,28 +1402,52 @@ ssize_t move_pages(struct userfaultfd_ctx *ctx, struct mm_struct *mm,
> > > > > >  		WARN_ON_ONCE(dst_start + len <= dst_start))
> > > > > >  		goto out;
> > > > >
> > > > > Ah, is this safe for rmap?  I think you need to leave this read lock.
> > > >
> > > > I didn't fully understand you here.
> >
> > Sorry, I'm confused on how your locking scheme avoids rmap from trying
> > to use the VMA with the atomic increment part.
>
> I'm also a bit confused. Which atomic increment are you referring to?
> AFAIU move_pages() will lock both src_vma and dst_vma, so even if rmap
> finds them it can't modify them, no?

The uffd atomic, mmap_changing.

...