Re: [patch] mm, memcg: add oom killer delay

Michal Hocko <mhocko@xxxxxxx> · Tue, 4 Jun 2013 21:27:08 +0200

On Tue 04-06-13 14:48:52, Johannes Weiner wrote:
> On Tue, Jun 04, 2013 at 11:17:49AM +0200, Michal Hocko wrote:
[...]
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 6dc1882..ff5e2d7 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -1815,7 +1815,7 @@ long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
> > >  			while (!(page = follow_page_mask(vma, start,
> > >  						foll_flags, &page_mask))) {
> > >  				int ret;
> > > -				unsigned int fault_flags = 0;
> > > +				unsigned int fault_flags = FAULT_FLAG_KERNEL;
> > >  
> > >  				/* For mlock, just skip the stack guard page. */
> > >  				if (foll_flags & FOLL_MLOCK) {
> > 
> > This is also a bit tricky. Say there is an unlikely situation when a
> > task fails to charge because of memcg OOM, it couldn't lock the oom
> > so it ended up with current->memcg_oom set and __get_user_pages will
> > turn VM_FAULT_OOM into ENOMEM but memcg_oom is still there. Then the
> > following global OOM condition gets confused (well the oom will be
> > triggered by somebody else so it shouldn't end up in the endless loop
> > but still...), doesn't it?
> 
> But current->memcg_oom is not set up unless current->in_userfault.
> And get_user_pages does not set this flag.

And my selective blindness strikes again :/ For some reason I read
those places as if they enabled the fault flag, which would make some
sense if there were post-handling...
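
For reference, the gating Johannes describes can be modeled in a few
lines of userspace C. This is only a sketch under stated assumptions:
struct task_model, charge_fails_with_oom(), user_fault() and gup_fault()
are hypothetical stand-ins invented for illustration, not the code from
the patch. It shows only that the OOM marker is recorded when
in_userfault is set and is left alone on the get_user_pages path.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two per-task fields discussed above. */
struct task_model {
	bool in_userfault;	/* set only while handling a user page fault */
	bool memcg_oom;		/* "memcg OOM pending" marker */
};

/* Model of a failed memcg charge: record the OOM only for user faults. */
void charge_fails_with_oom(struct task_model *t)
{
	if (t->in_userfault)
		t->memcg_oom = true;
}

/* Model of the user fault path: the flag is set around the fault. */
bool user_fault(struct task_model *t)
{
	t->in_userfault = true;
	charge_fails_with_oom(t);
	t->in_userfault = false;
	return t->memcg_oom;
}

/* Model of the get_user_pages path: in_userfault is never set. */
bool gup_fault(struct task_model *t)
{
	charge_fails_with_oom(t);
	return t->memcg_oom;
}
```

In this model a failed charge via gup_fault() leaves memcg_oom clear, so
there is no stale state for a later global OOM to trip over, while
user_fault() records it for whatever handling follows the fault.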

Anyway, I will get back to the updated patch tomorrow with a clean and
fresh head.

-- 
Michal Hocko
SUSE Labs
