Re: [PATCH v4] mm, oom: Fix race when selecting process to kill

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Nov 12, 2013 at 12:01 PM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> On 11/11, Sameer Nanda wrote:
>>
>> The selection of the process to be killed happens in two spots:
>> first in select_bad_process and then a further refinement by
>> looking for child processes in oom_kill_process. Since this is
>> a two step process, it is possible that the process selected by
>> select_bad_process may get a SIGKILL just before oom_kill_process
>> executes. If this were to happen, __unhash_process deletes this
>> process from the thread_group list. This results in oom_kill_process
>> getting stuck in an infinite loop when traversing the thread_group
>> list of the selected process.
>>
>> Fix this race by adding a pid_alive check for the selected process
>> with tasklist_lock held in oom_kill_process.
>
> OK, looks correct to me. Thanks.
>
>
> Yes, this is a step backwards, hopefully we will revert this patch soon.
> I am starting to think something like while_each_thread_lame_but_safe()
> makes sense before we really fix this nasty (and afaics not simple)
> problem with with while_each_thread() (which should die).

Looking forward to a real fix for the nasty problems with
while_each_thread.  In the meanwhile, let me float one more
(hopefully, the last) version of this patch that should address
Michal's concern.  Thanks for your feedback!

>
> Oleg.
>



-- 
Sameer

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]