On Mon, Oct 29, 2012 at 4:23 PM, Luigi Semenzato <semenzato@xxxxxxxxxx> wrote:
> On Mon, Oct 29, 2012 at 3:52 PM, David Rientjes <rientjes@xxxxxxxxxx> wrote:
>> On Mon, 29 Oct 2012, Luigi Semenzato wrote:
>>
>>> It looks like it's the final call to schedule() in do_exit():
>>>
>>>    0x81028520 <+1593>: call   0x813b68a0 <schedule>
>>>    0x81028525 <+1598>: ud2a
>>>
>>> (gdb) l *do_exit+0x63e
>>> 0x81028525 is in do_exit
>>> (/home/semenzato/trunk/src/third_party/kernel/files/kernel/exit.c:1069).
>>> 1064
>>> 1065         /* causes final put_task_struct in finish_task_switch(). */
>>> 1066         tsk->state = TASK_DEAD;
>>> 1067         tsk->flags |= PF_NOFREEZE;      /* tell freezer to ignore us */
>>> 1068         schedule();
>>> 1069         BUG();
>>> 1070         /* Avoid "noreturn function does return".  */
>>> 1071         for (;;)
>>> 1072                 cpu_relax();    /* For when BUG is null */
>>> 1073 }
>>>
>>
>> You're using an older kernel since the code you quoted from the oom killer
>> hasn't had the per-memcg oom kill rewrite.  There's logic that is called
>> from select_bad_process() that should exclude this thread from being
>> considered and deferred since it has a non-zero task->exit_thread, i.e. in
>> oom_scan_process_thread():
>>
>>         if (task->exit_state)
>>                 return OOM_SCAN_CONTINUE;
>>
>> And that's called from both the global oom killer and memcg oom killer.
>> So I'm thinking you're either running on an older kernel or there is no
>> oom condition at the time this is captured.
>
> Very sorry, I never said that we're on kernel 3.4.0.
>
> We are in a OOM-kill situation:
>
> ./arch/x86/include/asm/thread_info.h:91:#define TIF_MEMDIE              20
>
> Bit 20 in the threadinfo flags is set:
>
>> [96283.704390] chrome          x 815ecd20     0 16573   1112 0x00100104
>
> So your suggestion would be to apply OOM-related patches from a later kernel?
>
> Thanks!

Actually, I am not sure that the 3.6 OOM code is sufficiently
different to avoid this situation.
3.4 already has a test for task->exit_state, which in my case must be
failing even though TIF_MEMDIE is set and the process has finished
do_exit:

        do_each_thread(g, p) {
                unsigned int points;

                if (p->exit_state)
                        continue;
                ...

In fact, those changes look mostly cosmetic.
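
For anyone who wants to double-check the flag reading above, here is a
minimal userspace sketch (not kernel code) that tests bit 20 in the
flags word from the task dump.  TIF_MEMDIE = 20 is taken from the
thread_info.h line quoted above; the helper names tif_flag_is_set()
and report_memdie() are made up for this example and do not exist in
the kernel.

```c
#include <stdbool.h>
#include <stdio.h>

/* Value from the quoted arch/x86/include/asm/thread_info.h line. */
#define TIF_MEMDIE 20

/* Hypothetical helper: true when bit `bit` is set in a flags word. */
bool tif_flag_is_set(unsigned long flags, unsigned int bit)
{
        return ((flags >> bit) & 1UL) != 0;
}

/* Hypothetical helper: prints whether TIF_MEMDIE is set in `flags`. */
void report_memdie(unsigned long flags)
{
        printf("flags 0x%08lx: TIF_MEMDIE %s\n", flags,
               tif_flag_is_set(flags, TIF_MEMDIE) ? "set" : "clear");
}
```

Calling report_memdie(0x00100104UL) with the value from the dump
should report the bit as set, since 0x00100000 is 1 << 20.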