Re: [BUGFIX][PATCH] memcg: fix oom killer kills a task in other cgroup

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 9 Feb 2010 10:24:45 +0900
Minchan Kim <minchan.kim@xxxxxxxxx> wrote:

> On Tue, Feb 9, 2010 at 9:32 AM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> > On Sat, 6 Feb 2010 01:30:49 +0900
> > Minchan Kim <minchan.kim@xxxxxxxxx> wrote:
> >
> >> Hi, Kame.
> >>
> >> On Fri, Feb 5, 2010 at 9:39 AM, KAMEZAWA Hiroyuki
> >> <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> >> > Please take this patch in different context with recent discussion.
> >> > This is a quick-fix for a terrible bug.
> >> >
> >> > This patch itself is against mmotm but can be easily applied to mainline or
> >> > stable tree, I think. (But I don't CC stable tree until I get ack.)
> >> >
> >> > ==
> >> > Now, oom-killer kills process's chidlren at first. But this means
> >> > a child in other cgroup can be killed. But it's not checked now.
> >> >
> >> > This patch fixes that.
> >> >
> >> > CC: Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx>
> >> > CC: Daisuke Nishimura <nishimura@xxxxxxxxxxxxxxxxx>
> >> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
> >> > ---
> >> >  mm/oom_kill.c |    3 +++
> >> >  1 file changed, 3 insertions(+)
> >> >
> >> > Index: mmotm-2.6.33-Feb03/mm/oom_kill.c
> >> > ===================================================================
> >> > --- mmotm-2.6.33-Feb03.orig/mm/oom_kill.c
> >> > +++ mmotm-2.6.33-Feb03/mm/oom_kill.c
> >> > @@ -459,6 +459,9 @@ static int oom_kill_process(struct task_
> >> >        list_for_each_entry(c, &p->children, sibling) {
> >> >                if (c->mm == p->mm)
> >> >                        continue;
> >> > +               /* Children may be in other cgroup */
> >> > +               if (mem && !task_in_mem_cgroup(c, mem))
> >> > +                       continue;
> >> >                if (!oom_kill_task(c))
> >> >                        return 0;
> >> >        }
> >> >
> >> > --
> >>
> >> I am worried about latency of OOM at worst case.
> >> I mean that task_in_mem_cgroup calls task_lock of child.
> >> We have used task_lock in many place.
> >> Some place task_lock hold and then other locks.
> >> For example, exit_fs held task_lock and try to hold write_lock of fs->lock.
> >> If child already hold task_lock and wait to write_lock of fs->lock, OOM latency
> >> is dependent of fs->lock.
> >>
> >> I am not sure how many usecase is also dependent of other locks.
> >> If it is not as is, we can't make sure in future.
> >>
> >> So How about try_task_in_mem_cgroup?
> >> If we can't hold task_lock, let's continue next child.
> >>
> > It's recommended not to use trylock in unclear case.
> >
> > Then, I think possible replacement will be not-to-use any lock in
> > task_in_mem_cgroup. In my short consideration, I don't think task_lock
> > is necessary if we can add some tricks and memory barrier.
> >
> > Please let this patch to go as it is because this is an obvious bug fix
> > and give me time.
> 
> I think it's not only a latency problem of OOM but it is also a
> problem of deadlock.
> We can't expect child's lock state in oom_kill_process.
> 
yes.

> So if you can remove lock like below your suggestion, I am OKAY.
> 
I'll try. I don't like both mm->owner and children-kills and now they annoy me ;)

For mm, I'll prepare lockless version.
For stable tree, I'll prepare trylock version.

Thanks,
-Kame


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>

[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]