Re: zram OOM behavior

On Mon, Oct 29, 2012 at 4:23 PM, Luigi Semenzato <semenzato@xxxxxxxxxx> wrote:
> On Mon, Oct 29, 2012 at 3:52 PM, David Rientjes <rientjes@xxxxxxxxxx> wrote:
>> On Mon, 29 Oct 2012, Luigi Semenzato wrote:
>>
>>> It looks like it's the final call to schedule() in do_exit():
>>>
>>>    0x81028520 <+1593>: call   0x813b68a0 <schedule>
>>>    0x81028525 <+1598>: ud2a
>>>
>>> (gdb) l *do_exit+0x63e
>>> 0x81028525 is in do_exit
>>> (/home/semenzato/trunk/src/third_party/kernel/files/kernel/exit.c:1069).
>>> 1064
>>> 1065 /* causes final put_task_struct in finish_task_switch(). */
>>> 1066 tsk->state = TASK_DEAD;
>>> 1067 tsk->flags |= PF_NOFREEZE; /* tell freezer to ignore us */
>>> 1068 schedule();
>>> 1069 BUG();
>>> 1070 /* Avoid "noreturn function does return".  */
>>> 1071 for (;;)
>>> 1072 cpu_relax(); /* For when BUG is null */
>>> 1073 }
>>>
>>
>> You're using an older kernel since the code you quoted from the oom killer
>> hasn't had the per-memcg oom kill rewrite.  There's logic that is called
>> from select_bad_process() that should exclude this thread from being
>> considered and deferred since it has a non-zero task->exit_state, i.e. in
>> oom_scan_process_thread():
>>
>>         if (task->exit_state)
>>                 return OOM_SCAN_CONTINUE;
>>
>> And that's called from both the global oom killer and memcg oom killer.
>> So I'm thinking you're either running on an older kernel or there is no
>> oom condition at the time this is captured.


> Very sorry, I never mentioned that we're on kernel 3.4.0.
>
> We are in an OOM-kill situation:
>
> ./arch/x86/include/asm/thread_info.h:91:#define TIF_MEMDIE 20
>
> Bit 20 in the thread_info flags is set:
>
>> [96283.704390] chrome          x 815ecd20     0 16573   1112 0x00100104
>
> So your suggestion would be to apply OOM-related patches from a later kernel?
>
> Thanks!
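
To make the flag arithmetic above easy to check: the last column of the chrome line in the task dump is the flags word 0x00100104, and TIF_MEMDIE is bit 20 according to the grep output. A minimal sketch in plain user-space C follows. The helper name tif_flag_is_set() is made up for illustration and is not a kernel function; only the TIF_MEMDIE value and the flags word come from this thread.

```c
/* Bit index taken from the grep output quoted above
 * (arch/x86/include/asm/thread_info.h in the 3.4 tree). */
#define TIF_MEMDIE 20

/*
 * Hypothetical helper, not a kernel function: returns 1 if the given
 * bit is set in a thread_info flags word, 0 otherwise.
 */
int tif_flag_is_set(unsigned long flags, unsigned int bit)
{
	return (int)((flags >> bit) & 1UL);
}
```

For the value in the dump, tif_flag_is_set(0x00100104UL, TIF_MEMDIE) returns 1, since 0x00100000 is exactly 1 << 20. The remaining set bits in that word are bits 2 and 8.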

Actually, I am not sure that the 3.6 OOM code is sufficiently
different to avoid this situation.  3.4 already has a test for
task->exit_state, which in my case must be failing even though
TIF_MEMDIE is set and the process has finished do_exit:

do_each_thread(g, p) {
  unsigned int points;

  if (p->exit_state)
    continue;
...

In fact, those changes look mostly cosmetic.
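
For reference, the effect of that exit_state test can be modelled in user space. This is only a sketch under stated assumptions: struct model_task and count_scan_candidates() are hypothetical names, the struct stands in for the single task_struct field involved, and everything else the real scan does is left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for the one task_struct field this sketch needs. */
struct model_task {
	int pid;
	int exit_state;	/* nonzero once the task is marked zombie or dead */
};

/*
 * Count the tasks a scan would still consider, i.e. those whose
 * exit_state is zero. Models only the test in the loop quoted above.
 */
size_t count_scan_candidates(const struct model_task *tasks, size_t n)
{
	size_t i, candidates = 0;

	for (i = 0; i < n; i++) {
		if (tasks[i].exit_state)
			continue;	/* skipped, as in the quoted loop */
		candidates++;
	}
	return candidates;
}
```

A task is dropped from consideration only when its exit_state is nonzero at the moment the scan looks at it. A task that still reads as zero stays a candidate, however far along do_exit() it is.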
