On Tue, Aug 1, 2023 at 3:07 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote: > > During recent vma locking patch reviews Linus and Jann Horn noted a number > of issues with vma locking and suggested improvements: > > 1. walk_page_range() does not have ability to write-lock a vma during the > walk when it's done under mmap_write_lock. For example s390_reset_cmma(). > > 2. Vma locking is hidden inside vm_flags modifiers and is hard to follow. > Suggestion is to change vm_flags_reset{_once} to assert that vma is > write-locked and require an explicit locking. > > 3. Same issue with vma_prepare() hiding vma locking. > > 4. In userfaultfd vm_flags are modified after vma->vm_userfaultfd_ctx and > page faults can operate on a context while it's changed. > > 5. do_brk_flags() and __install_special_mapping() not locking a newly > created vma before adding it into the mm. While not strictly a problem, > this is fragile if vma is modified after insertion, as in the > mmap_region() case which was recently fixed. Suggestion is to always lock > a new vma before inserting it and making it visible to page faults. > > 6. vma_assert_write_locked() for CONFIG_PER_VMA_LOCK=n would benefit from > being mmap_assert_write_locked() instead of no-op and then any place which > operates on a vma and calls mmap_assert_write_locked() can be converted > into vma_assert_write_locked(). > > I CC'ed stable only on the first patch because others are cleanups and the > bug in userfaultfd does not affect stable (lock_vma_under_rcu prevents > uffds from being handled under vma lock protection). However I would be > happy if the whole series is merged into stable 6.4 since it makes vma > locking more maintainable. > > The patches apply cleanly over Linus' ToT and will conflict when applied > over mm-unstable due to missing [1]. The conflict can be easily resolved > by ignoring conflicting deletions but probably simpler to take [1] into > mm-unstable and avoid later conflict. > > [1] commit 6c21e066f925 ("mm/mempolicy: Take VMA lock before replacing policy") > > Changes since v1: > - replace walk_page_range() parameter with mm_walk_ops.walk_lock, > per Linus > - introduced page_walk_lock enum to allow different locking modes > during a walk, per Linus > - added Liam's Reviewed-by v3 is posted at https://lore.kernel.org/all/20230803172652.2849981-1-surenb@xxxxxxxxxx/ > > Suren Baghdasaryan (6): > mm: enable page walking API to lock vmas during the walk > mm: for !CONFIG_PER_VMA_LOCK equate write lock assertion for vma and > mmap > mm: replace mmap with vma write lock assertions when operating on a > vma > mm: lock vma explicitly before doing vm_flags_reset and > vm_flags_reset_once > mm: always lock new vma before inserting into vma tree > mm: move vma locking out of vma_prepare > > arch/powerpc/kvm/book3s_hv_uvmem.c | 1 + > arch/powerpc/mm/book3s64/subpage_prot.c | 1 + > arch/riscv/mm/pageattr.c | 1 + > arch/s390/mm/gmap.c | 5 ++++ > drivers/infiniband/hw/hfi1/file_ops.c | 1 + > fs/proc/task_mmu.c | 5 ++++ > fs/userfaultfd.c | 6 +++++ > include/linux/mm.h | 13 ++++++--- > include/linux/pagewalk.h | 11 ++++++++ > mm/damon/vaddr.c | 2 ++ > mm/hmm.c | 1 + > mm/hugetlb.c | 2 +- > mm/khugepaged.c | 5 ++-- > mm/ksm.c | 25 ++++++++++------- > mm/madvise.c | 8 +++--- > mm/memcontrol.c | 2 ++ > mm/memory-failure.c | 1 + > mm/memory.c | 2 +- > mm/mempolicy.c | 22 +++++++++------ > mm/migrate_device.c | 1 + > mm/mincore.c | 1 + > mm/mlock.c | 4 ++- > mm/mmap.c | 29 +++++++++++++------- > mm/mprotect.c | 2 ++ > mm/pagewalk.c | 36 ++++++++++++++++++++++--- > mm/vmscan.c | 1 + > 26 files changed, 146 insertions(+), 42 deletions(-) > > -- > 2.41.0.585.gd2178a4bd4-goog >