On Thu, Dec 09, 2021 at 11:13:25AM -0800, Suren Baghdasaryan wrote:
> With exit_mmap holding mmap_write_lock during free_pgtables call,
> process_mrelease does not need to elevate mm->mm_users in order to
> prevent exit_mmap from destroying pagetables while __oom_reap_task_mm
> is walking the VMA tree. The change prevents process_mrelease from
> calling the last mmput, which can lead to waiting for IO completion
> in exit_aio.
>
> Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
> Acked-by: Michal Hocko <mhocko@xxxxxxxx>
> ---
> changes in v5
> - Removed Fixes: tag, per Michal Hocko
> - Added Acked-by's
>
>  mm/oom_kill.c | 27 +++++++++++++++------------
>  1 file changed, 15 insertions(+), 12 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@xxxxxxxxxx>

There are mmget_not_zero()'s all over the place, can the others be
cleaned up after this series goes ahead too?

It seems like anything doing the mmget just to look at the vma list
under the mmap lock is now fine with only a mmgrab?

A few I know about:

drivers/infiniband/core/umem_odp.c:     if (!mmget_not_zero(umem->owning_mm)) {

This is because mmu_interval_notifier_insert() might call
mm_take_all_locks(), which was unsafe with a concurrent exit_mmap.

drivers/infiniband/core/umem_odp.c:     if (!owning_process || !mmget_not_zero(owning_mm)) {

This is because it calls hmm_range_fault(), which iterates over the
vma list. That is safe now.

drivers/iommu/iommu-sva-lib.c:  return mmget_not_zero(mm);
drivers/iommu/iommu-sva-lib.c:  return ioasid_find(&iommu_sva_pasid, pasid, __mmget_not_zero);

It calls find_extend_vma() - but it also doesn't seem to hold a mmgrab
when it does that mmget. The RCU usage is messed up here too, so hmm.

Jason