[PATCH 0/3] oom: TIF_MEMDIE/PF_EXITING fixes

On 03/12, Oleg Nesterov wrote:
>
> On 03/11, David Rientjes wrote:
> >
> > On Wed, 9 Mar 2011, Andrew Morton wrote:
> >
> > > If Oleg's test program causes a hang with
> > > oom-prevent-unnecessary-oom-kills-or-kernel-panics.patch and doesn't
> > > cause a hang without
> > > oom-prevent-unnecessary-oom-kills-or-kernel-panics.patch then that's a
> > > big problem for
> > > oom-prevent-unnecessary-oom-kills-or-kernel-panics.patch, no?
> > >
> >
> > It's a problem, but not because of
> > oom-prevent-unnecessary-oom-kills-or-kernel-panics.patch.
>
> It is, afaics. oom-killer can't assume that a single PF_EXITING && p->mm
> thread is going to free the memory.
>
> > If we don't
> > have this patch, then we have a trivial panic when an oom kill occurs in a
> > cpuset with no other eligible processes, the oom killed thread group
> > leader exits
>
> It is not clear what "leader exits" actually means. OK, perhaps you mean
> its ->mm == NULL.
>
> > but its other threads do not and they trigger oom kills
> > themselves.  for_each_process() does not iterate over these threads and so
> > it finds no eligible threads to kill and then panics
>
> Could you explain what you mean? No need to kill these threads, they
> are already killed, we should wait until they all exit.
>
> > I'll look at Oleg's test case
> > and see what can be done to fix that condition, but the answer isn't to
> > ignore eligible threads that can be killed.
>
> Once again, they are already killed. Or I do not understand what you meant.
>
> Could you please explain the problem in more detail?
>
>
> Also. Could you please look at the patches I sent?
>
> 	[PATCH 1/1] oom_kill_task: mark every thread as TIF_MEMDIE
> 	[PATCH v2 1/1] select_bad_process: improve the PF_EXITING check

Cough. And neither was right: while_each_thread(p, t) needs a properly
initialized "t". At least I warned that they were not tested ;)
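
To illustrate the point about "t", here is a minimal userspace sketch. It is
not the code from the patches. "struct task", its "memdie" field and
mark_group_memdie() are hypothetical stand-ins; only the shape of the
while_each_thread() macro is taken from the kernel of that period. Because the
macro advances "t" before comparing it with "p", the iterator has to start at
"p":

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for task_struct: the threads of one
 * group sit on a circular singly linked list.
 */
struct task {
	struct task *next;	/* next thread in the same group */
	bool memdie;		/* stand-in for TIF_MEMDIE */
};

static struct task *next_thread(struct task *t)
{
	return t->next;
}

/* Same shape as the kernel macro: step t, stop when back at g. */
#define while_each_thread(g, t) \
	while ((t = next_thread(t)) != g)

/* Mark every thread in p's group; returns the number of threads visited. */
static int mark_group_memdie(struct task *p)
{
	struct task *t = p;	/* the macro reads t, so it must start at p */
	int n = 0;

	do {
		t->memdie = true;
		n++;
	} while_each_thread(p, t);

	return n;
}
```

With "t" left uninitialized, the first pass of the loop body and the first
next_thread(t) would both read a garbage pointer.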

> Note also the note about the "p == current" check. It should be fixed too.

I am resending the fixes above plus the new one.

David, Kosaki, what do you think?

Oleg.
