On Thu, Oct 21, 2021 at 7:25 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, 21 Oct 2021 18:46:58 -0700 Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
>
> > Race between process_mrelease and exit_mmap, where free_pgtables is
> > called while __oom_reap_task_mm is in progress, leads to kernel crash
> > during pte_offset_map_lock call. oom-reaper avoids this race by setting
> > MMF_OOM_VICTIM flag and causing exit_mmap to take and release
> > mmap_write_lock, blocking it until oom-reaper releases mmap_read_lock.
> > Reusing MMF_OOM_VICTIM for process_mrelease would be the simplest way to
> > fix this race, however that would be considered a hack. Fix this race
> > by elevating mm->mm_users and preventing exit_mmap from executing until
> > process_mrelease is finished. Patch slightly refactors the code to adapt
> > for a possible mmget_not_zero failure.
> > This fix has considerable negative impact on process_mrelease performance
> > and will likely need later optimization.
>
> Has the impact been quantified?

As a ball-park figure: for a large process (6GB), process_mrelease takes
about 4x longer to exit.

>
> And where's the added cost happening?  The changes all look quite
> lightweight?

I think it's caused by the fact that exit_mmap and all the other cleanup
routines that run on the last mmput are now postponed until
process_mrelease finishes __oom_reap_task_mm and drops mm->mm_users.
I suspect all of that cleanup now happens at the end of process_mrelease,
and that might be contributing to the regression. I haven't had time yet
to fully understand all the reasons for the regression, but I wanted to
fix the crash first. I will continue investigating and hopefully follow
up with a quick fix for the lost performance.
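
To make the suspected mechanism concrete, here is a minimal userspace
sketch of the reference-count pattern described above. It is not kernel
code and not the patch itself: struct model_mm, model_mmget_not_zero(),
model_mmput() and model_scenario() are hypothetical stand-ins for
mm_struct, mmget_not_zero() and mmput(), and the "teardown" is reduced to
setting a flag. It only illustrates that whoever drops the last reference
ends up running the cleanup.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of mm_struct; names are illustrative only. */
enum { BY_NOBODY = 0, BY_EXITING_TASK = 1, BY_MRELEASE = 2 };

struct model_mm {
    atomic_int users;   /* models mm->mm_users */
    bool torn_down;     /* set once the exit_mmap-like teardown has run */
    int teardown_by;    /* which caller dropped the last reference */
};

/* Take a reference unless the count already reached zero. */
static bool model_mmget_not_zero(struct model_mm *mm)
{
    int old = atomic_load(&mm->users);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&mm->users, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; the caller that drops the last one runs the teardown. */
static void model_mmput(struct model_mm *mm, int who)
{
    int old = atomic_fetch_sub(&mm->users, 1);

    assert(old > 0);
    if (old == 1) {
        mm->torn_down = true;
        mm->teardown_by = who;
    }
}

/*
 * Scenario from the discussion: the reaping path pins the mm first, the
 * exiting task then drops its reference, and the reaping path drops the
 * pin last. Returns who ended up running the teardown.
 */
static int model_scenario(void)
{
    struct model_mm mm = { .torn_down = false, .teardown_by = BY_NOBODY };

    atomic_init(&mm.users, 1);              /* reference held by the task */
    if (!model_mmget_not_zero(&mm))         /* pin taken by the reaping path */
        return BY_NOBODY;
    model_mmput(&mm, BY_EXITING_TASK);      /* no longer the last reference */
    /* ... reaping work would happen here, with teardown held off ... */
    model_mmput(&mm, BY_MRELEASE);          /* last reference: teardown runs */
    return mm.teardown_by;
}
```

In this model the teardown is attributed to the reaping path rather than
to the exiting task, which matches the suspicion that the cleanup cost is
now paid at the end of process_mrelease. Whether that fully explains the
measured slowdown is still an open question.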