On Sun 04-09-16 10:49:42, Tetsuo Handa wrote:
> Michal Hocko wrote:
[...]
> > @@ -3309,6 +3318,22 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
> >  	return alloc_flags;
> >  }
> >  
> > +static bool oom_reserves_allowed(struct task_struct *tsk)
> > +{
> > +	if (!tsk_is_oom_victim(tsk))
> > +		return false;
> > +
> > +	/*
> > +	 * !MMU doesn't have oom reaper so we shouldn't risk the memory
> > +	 * reserves depletion and shouldn't give access to memory reserves
> > +	 * past the exit_mm
> > +	 */
> > +	if (!IS_ENABLED(CONFIG_MMU) && !tsk->mm)
> > +		return false;
> > +
> > +	return true;
> > +}
> > +
>
> Are you aware that you are trying to make !MMU kernel's allocations fail
> without allowing access to memory reserves not only after returning from
> exit_mm() but also from __mmput() called from mmput() from exit_mm()?

Do we allocate from that path on !MMU, and would that be more broken than
the current code, which clears TIF_MEMDIE after mmput() even when
__mmput() is not called (i.e. somebody is holding a reference to the mm -
e.g. a proc file)?

> The comment talks about only after returning from exit_mm(), but this
> change is not limited to that.

I can see that the comment is not ideal. Any suggestion on how to make it
better?

> > @@ -3558,8 +3593,8 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
> >  		goto nopage;
> >  	}
> >  
> > -	/* Avoid allocations with no watermarks from looping endlessly */
> > -	if (test_thread_flag(TIF_MEMDIE) && !(gfp_mask & __GFP_NOFAIL))
> > +	/* Avoid allocations for oom victims from looping endlessly */
> > +	if (tsk_is_oom_victim(current) && !(gfp_mask & __GFP_NOFAIL))
> >  		goto nopage;
>
> This change increases the possibility of giving up without trying
> ALLOC_OOM (more allocation failure messages): currently only a thread
> which remotely got TIF_MEMDIE while it was between gfp_to_alloc_flags()
> and test_thread_flag(TIF_MEMDIE) will give up without trying
> ALLOC_NO_WATERMARKS, while all threads which remotely got
> current->signal->oom_mm while they were between gfp_to_alloc_flags() and
> the tsk_is_oom_victim(current) check will give up without trying
> ALLOC_OOM. I think we should make sure that ALLOC_OOM is tried (by using
> a variable which remembers whether get_page_from_freelist(ALLOC_OOM) was
> tried).

Technically speaking you are right, but I am not really sure that this
matters all that much. This code has always been racy. If we ever consider
the race harmful we can reorganize the alloc slow path in a way to
guarantee at least one allocation attempt with ALLOC_OOM. I am just not
sure it is necessary right now. If this ever shows up as a problem we
would see a flood of allocation failures followed by the OOM report, so
it would be quite easy to notice.
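If we ever needed that, the reorganization could look something like the
completely untested sketch below (tried_oom_reserves is a made up name
and the surrounding slow path is elided, so take it as an illustration
rather than a patch):

	/* in __alloc_pages_slowpath() */
	bool tried_oom_reserves = false;

retry:
	[...]
	if (oom_reserves_allowed(current)) {
		alloc_flags = ALLOC_OOM;
		tried_oom_reserves = true;
	}

	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
	if (page)
		goto got_pg;
	[...]
	/* Avoid allocations for oom victims from looping endlessly */
	if (tsk_is_oom_victim(current) && !(gfp_mask & __GFP_NOFAIL)) {
		/*
		 * Guarantee at least one attempt with the memory
		 * reserves before failing the allocation.
		 */
		if (oom_reserves_allowed(current) && !tried_oom_reserves)
			goto retry;
		goto nopage;
	}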
> We are currently allowing TIF_MEMDIE threads to try ALLOC_NO_WATERMARKS
> once and give up without invoking the OOM killer. This change makes
> current->signal->oom_mm threads try ALLOC_OOM once and give up without
> invoking the OOM killer. This means that allocations which oom victims
> need for cleaning up might fail prematurely, but we don't want to
> scatter __GFP_NOFAIL around. Since there are reasonable chances of
> parallel memory freeing, we don't need to give up without invoking the
> OOM killer again.
>
> I think that
>
> -	/* Avoid allocations with no watermarks from looping endlessly */
> -	if (test_thread_flag(TIF_MEMDIE) && !(gfp_mask & __GFP_NOFAIL))
> +#ifndef CONFIG_MMU
> +	/* Avoid allocations for oom victims from looping endlessly */
> +	if (tsk_is_oom_victim(current) && !(gfp_mask & __GFP_NOFAIL))
> +		goto nopage;
> +#endif
>
> is possible.

I would prefer to not spread out MMU ifdefs all over the place.
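FWIW the same check can be written without the ifdef by using
IS_ENABLED(), in the style of oom_reserves_allowed() above. An untested
illustration rather than a proposal:

	/* Avoid allocations for oom victims from looping endlessly */
	if (!IS_ENABLED(CONFIG_MMU) &&
	    tsk_is_oom_victim(current) && !(gfp_mask & __GFP_NOFAIL))
		goto nopage;

The compiler still sees both branches on all configurations, and on
CONFIG_MMU kernels the check is compiled away just like with the ifdef.

-- 
Michal Hocko
SUSE Labs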