On 2/6/22 22:42, Hugh Dickins wrote:
> Fill in missing pieces: reimplementation of munlock_vma_pages_range(),
> required to lower the mlock_counts when munlocking without munmapping;
> and its complement, implementation of mlock_vma_pages_range(), required
> to raise the mlock_counts on pages already there when a range is mlocked.
>
> Combine them into just the one function mlock_vma_pages_range(), using
> walk_page_range() to run mlock_pte_range(). This approach fixes the
> "Very slow unlockall()" of unpopulated PROT_NONE areas, reported in
> https://lore.kernel.org/linux-mm/70885d37-62b7-748b-29df-9e94f3291736@xxxxxxxxx/
>
> Munlock clears VM_LOCKED at the start, under exclusive mmap_lock; but if
> a racing truncate or holepunch (depending on i_mmap_rwsem) gets to the
> pte first, it will not try to munlock the page: leaving release_pages()
> to correct it when the last reference to the page is gone - that's okay,
> a page is not evictable anyway while it is held by an extra reference.
>
> Mlock sets VM_LOCKED at the start, under exclusive mmap_lock; but if
> a racing remove_migration_pte() or try_to_unmap_one() (depending on
> i_mmap_rwsem) gets to the pte first, it will try to mlock the page,
> then mlock_pte_range() mlock it a second time. This is harder to
> reproduce, but a more serious race because it could leave the page
> unevictable indefinitely though the area is munlocked afterwards.
> Guard against it by setting the (inappropriate) VM_IO flag,
> and modifying mlock_vma_page() to decline such vmas.
>
> Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>

Acked-by: Vlastimil Babka <vbabka@xxxxxxx>

> @@ -162,8 +230,7 @@ static int mlock_fixup(struct vm_area_struct *vma, struct vm_area_struct **prev,
>  	pgoff_t pgoff;
>  	int nr_pages;
>  	int ret = 0;
> -	int lock = !!(newflags & VM_LOCKED);
> -	vm_flags_t old_flags = vma->vm_flags;
> +	vm_flags_t oldflags = vma->vm_flags;
>
>  	if (newflags == vma->vm_flags || (vma->vm_flags & VM_SPECIAL) ||

Nit: can use oldflags instead of vma->vm_flags above?
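I.e., keeping only the check itself, something like (untested):

	if (newflags == oldflags || (oldflags & VM_SPECIAL) ||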
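For anyone reviewing without the whole series applied, my reading of the
combined walk, as a simplified sketch and not the patch itself: it omits
the trans-huge pmd path, and assumes the per-page mlock_page() /
munlock_page() helpers from earlier in the series. The same callback
serves both directions, keyed on VM_LOCKED, and walk_page_range() never
descends into pmd_none() holes, which is what cures the slow
unlockall() on unpopulated PROT_NONE areas.

static int mlock_pte_range(pmd_t *pmd, unsigned long addr,
			   unsigned long end, struct mm_walk *walk)
{
	struct vm_area_struct *vma = walk->vma;
	pte_t *start_pte, *pte;
	spinlock_t *ptl;
	struct page *page;

	/* The real callback handles a trans-huge pmd here first. */

	start_pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
	for (pte = start_pte; addr != end; pte++, addr += PAGE_SIZE) {
		if (!pte_present(*pte))
			continue;
		page = vm_normal_page(vma, addr, *pte);
		if (!page)
			continue;
		if (vma->vm_flags & VM_LOCKED)
			mlock_page(page);	/* raises page's mlock_count */
		else
			munlock_page(page);	/* lowers page's mlock_count */
	}
	pte_unmap_unlock(start_pte, ptl);

	cond_resched();
	return 0;
}

static const struct mm_walk_ops mlock_walk_ops = {
	.pmd_entry = mlock_pte_range,
};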
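And the VM_IO guard from the final quoted paragraph, again only a rough
sketch with names and argument lists simplified from the actual patch:
VM_IO is one of the VM_SPECIAL flags, so publishing it transiently while
the walk runs gives mlock_vma_page() a cheap reason to decline the
racing mlock.

static void mlock_vma_pages_range(struct vm_area_struct *vma,
		unsigned long start, unsigned long end, vm_flags_t newflags)
{
	/*
	 * When mlocking, transiently show VM_LOCKED|VM_IO so that a racing
	 * remove_migration_pte() or try_to_unmap_one() declines to raise
	 * the mlock_count a second time. mmap_lock is held for write, so
	 * only rmap walkers can observe this odd flag combination.
	 */
	if (newflags & VM_LOCKED)
		newflags |= VM_IO;
	WRITE_ONCE(vma->vm_flags, newflags);

	walk_page_range(vma->vm_mm, start, end, &mlock_walk_ops, NULL);

	if (newflags & VM_IO)
		WRITE_ONCE(vma->vm_flags, newflags & ~VM_IO);
}

/* ...while mlock_vma_page() declines any VM_SPECIAL vma, VM_IO included: */
static inline void mlock_vma_page(struct page *page,
		struct vm_area_struct *vma)
{
	if (unlikely((vma->vm_flags & (VM_LOCKED | VM_SPECIAL)) == VM_LOCKED))
		mlock_page(page);
}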