Michal Hocko wrote:
> From: Michal Hocko <mhocko@xxxxxxxx>
>
> 0-day robot has encountered the following:
> [ 82.694232] Out of memory: Kill process 3914 (trinity-c0) score 167 or sacrifice child
> [ 82.695110] Killed process 3914 (trinity-c0) total-vm:55864kB, anon-rss:1512kB, file-rss:1088kB, shmem-rss:25616kB
> [ 82.706724] oom_reaper: reaped process 3914 (trinity-c0), now anon-rss:0kB, file-rss:0kB, shmem-rss:26488kB
> [ 82.715540] oom_reaper: reaped process 3914 (trinity-c0), now anon-rss:0kB, file-rss:0kB, shmem-rss:26900kB
> [ 82.717662] oom_reaper: reaped process 3914 (trinity-c0), now anon-rss:0kB, file-rss:0kB, shmem-rss:26900kB
> [ 82.725804] oom_reaper: reaped process 3914 (trinity-c0), now anon-rss:0kB, file-rss:0kB, shmem-rss:27296kB
> [ 82.739091] oom_reaper: reaped process 3914 (trinity-c0), now anon-rss:0kB, file-rss:0kB, shmem-rss:28148kB
>
> oom_reaper is trying to reap the same task again and again. This
> is possible only when the oom killer is bypassed because of
> task_will_free_mem because we skip over tasks with MMF_OOM_REAPED
> already set during select_bad_process. Teach task_will_free_mem to skip
> over MMF_OOM_REAPED tasks as well because they will be unlikely to free
> anything more.

I agree that we need to prevent the same mm from being selected forever.
But I feel worried about this patch. We are reaching a stage where it is
unclear what purpose we set TIF_MEMDIE for.

mark_oom_victim() sets TIF_MEMDIE on a thread with oom_lock held. Thus, if
the mm which the TIF_MEMDIE thread is using is reapable (likely yes),
__oom_reap_task() will likely be the next thread to get that lock, because
__oom_reap_task() uses mutex_lock(&oom_lock) whereas other threads using
that mm use mutex_trylock(&oom_lock). As a result, regarding CONFIG_MMU=y
kernels, I guess that the

	if (task_will_free_mem(current)) {

shortcut in out_of_memory() likely becomes a useless condition.
Since the OOM reaper will quickly reap the mm, set MMF_OOM_REAPED on that mm
and clear TIF_MEMDIE, other threads using that mm will fail to get
TIF_MEMDIE (because task_will_free_mem() will start returning false due to
this patch) and will proceed to the next OOM victim selection. The comment

	 * That thread will now get access to memory reserves since it has a
	 * pending fatal signal.

in oom_kill_process() becomes almost dead.

Since we need a short delay in order to allow get_page_from_freelist() to
allocate from memory reclaimed by __oom_reap_task(), this patch might
increase the possibility of excessively preventing OOM-killed threads from
using ALLOC_NO_WATERMARKS via TIF_MEMDIE, and increase the possibility of
needlessly selecting the next OOM victim.

So, maybe we shouldn't let this shortcut return false as soon as
MMF_OOM_REAPED is set.
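
To make the concern concrete, here is a rough user-space toy model of the sequence described in this message. It is only a sketch under stated assumptions: every toy_* name is hypothetical and not a kernel API, the boolean fields merely stand in for MMF_OOM_REAPED, TIF_MEMDIE and a pending fatal signal, and the real task_will_free_mem() checks considerably more than this. It also assumes the reaper always runs before the sibling thread reaches the shortcut, which is the case argued above as likely but not guaranteed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model. None of these names are kernel APIs. */
struct toy_mm {
	bool oom_reaped;	/* stands in for MMF_OOM_REAPED */
};

struct toy_task {
	struct toy_mm *mm;
	bool dying;		/* stands in for "killed, fatal signal pending" */
	bool memdie;		/* stands in for TIF_MEMDIE */
};

/* Assumed simplification of task_will_free_mem(); the real check does more. */
static bool toy_will_free_mem(const struct toy_task *t, bool patched)
{
	if (!t->dying)
		return false;
	/* With the patch, an already reaped mm no longer qualifies. */
	if (patched && t->mm->oom_reaped)
		return false;
	return true;
}

/* Toy shortcut at the top of out_of_memory(): true means "no new victim". */
static bool toy_out_of_memory(struct toy_task *cur, bool patched)
{
	if (toy_will_free_mem(cur, patched)) {
		cur->memdie = true;	/* grants access to memory reserves */
		return true;
	}
	return false;		/* would fall through to victim selection */
}

/* Toy reaper: marks the mm reaped and drops TIF_MEMDIE from the victim. */
static void toy_oom_reap(struct toy_task *victim)
{
	victim->mm->oom_reaped = true;
	victim->memdie = false;
}

/* Scenario: the victim is reaped before a sibling thread hits the shortcut. */
static bool toy_sibling_gets_memdie(bool patched)
{
	struct toy_mm mm = { .oom_reaped = false };
	struct toy_task victim = { .mm = &mm, .dying = true, .memdie = false };
	struct toy_task sibling = { .mm = &mm, .dying = true, .memdie = false };

	/* First thread takes the shortcut while the mm is not yet reaped. */
	bool took_shortcut = toy_out_of_memory(&victim, patched);
	assert(took_shortcut && victim.memdie);

	/* Assumption: the reaper wins the race for oom_lock. */
	toy_oom_reap(&victim);

	toy_out_of_memory(&sibling, patched);
	return sibling.memdie;
}
```

Calling toy_sibling_gets_memdie(false) models the behaviour without the patch, where the sibling thread still takes the shortcut; toy_sibling_gets_memdie(true) models the patched check, where the sibling is refused and the caller would go on to select another victim.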