On 2018/09/10 20:40, Michal Hocko wrote:
> On Mon 10-09-18 20:27:21, Tetsuo Handa wrote:
>> On 2018/09/10 18:54, Michal Hocko wrote:
>>> On Sat 08-09-18 13:54:12, Tetsuo Handa wrote:
>>> [...]
>>>
>>> I will not comment on the above because I have already done so and you
>>> keep ignoring it, so I will not waste my time again.
>>
>> Then, your NACK no longer stands.
>
> And how exactly have you reached that conclusion? Nothing has really
> changed. Except you keep pushing this crap no matter what you keep
> hearing.

You obviously do not hesitate to put words into my mouth. THEN, PLEASE
SHOW US YOUR PATCH. WHAT YOU ARE SAYING IS ONLY "WE DON'T CARE" AND
WHAT I AND DAVID ARE SAYING IS "WE DO CARE".

>
>>>> (2) sysctl_oom_kill_allocating_task path can be selected forever
>>>>     because it did not check for MMF_OOM_SKIP.
>>>
>>> Why is that a problem? sysctl_oom_kill_allocating_task doesn't have any
>>> well defined semantic. It is a gross hack to save long and expensive oom
>>> victim selection. If we were too strict we should even not allow anybody
>>> else but an allocating task to be killed. So selecting it multiple times
>>> doesn't sound harmful to me.
>>
>> After the current thread was selected as an OOM victim by that code path
>> and the OOM reaper reaped the current thread's memory, the OOM killer
>> has to select the next OOM victim,
>
> And how have you reached this conclusion? What kind of "kill the
> allocating task" semantic really implies this?

FOR THE PROOF OF "THE FORWARD PROGRESS GUARANTEE".

>>>> (3) CONFIG_MMU=n kernels might livelock because nobody except the
>>>>     is_global_init() case in __oom_kill_process() sets MMF_OOM_SKIP.
>>>
>>> And now the obligatory question. Is this a real problem?
>>
>> I SAID "POSSIBLE BUGS". That you have never heard of it is not proof
>> that the problem is not occurring in the world. Not everybody is
>> skillful enough to report OOM (or low memory) problems to you!
>
> No, we are not making the code overly complex or convoluted for
> theoretically possible issues we have never heard before.

FOR THE PROOF OF "THE FORWARD PROGRESS GUARANTEE".

>>>> which prevent proof of "the forward progress guarantee"
>>>> and adds one optimization
>>>>
>>>> (4) oom_evaluate_task() was calling oom_unkillable_task() twice because
>>>>     oom_evaluate_task() needs to check for !MMF_OOM_SKIP and
>>>>     oom_task_origin() tasks before calling oom_badness().
>>>
>>> ENOPARSE
>>>
>>
>> Not difficult to parse at all.
>>
>> oom_evaluate_task() {
>>
>>         if (oom_unkillable_task(task, NULL, oc->nodemask))
>>                 goto next;
>>
>>         if (!is_sysrq_oom(oc) && tsk_is_oom_victim(task)) {
>>                 if (test_bit(MMF_OOM_SKIP, &task->signal->oom_mm->flags))
>>                         goto next;
>>                 goto abort;
>>         }
>>
>>         if (oom_task_origin(task)) {
>>                 points = ULONG_MAX;
>>                 goto select;
>>         }
>>
>>         points = oom_badness(task, NULL, oc->nodemask, oc->totalpages) {
>>
>>                 if (oom_unkillable_task(p, memcg, nodemask))
>>                         return 0;
>>
>>         }
>> }
>>
>> By moving oom_task_origin() to inside oom_badness(), and
>> by bringing the !MMF_OOM_SKIP test earlier, we can eliminate the
>>
>>         oom_unkillable_task(task, NULL, oc->nodemask)
>>
>> test in oom_evaluate_task().
>
> And so what?
>

WE CAN MAKE THE RCU PROTECTED SECTION SHORTER. IT IS 99% WASTE TO TEST
oom_unkillable_task(task, NULL, oc->nodemask) FOR UNLIKELY MMF_OOM_SKIP
OR oom_task_origin() CASES.