Re: [PATCH 1/4] mm,oom: Don't call schedule_timeout_killable() with oom_lock held.

Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> · Fri, 8 Jun 2018 19:47:18 +0900

On 2018/06/07 20:28, Michal Hocko wrote:
> On Thu 07-06-18 20:00:23, Tetsuo Handa wrote:
> OK, this looks like a nice shortcut. I am quite surprise that all your
> NOMMU concerns are gone now while you clearly regress that case because
> inflight victims are not detected anymore AFAICS. Not that I care all
> that much, just sayin'.
> 
> Anyway, I would suggest splitting this into two patches. One to add an
> early check for inflight oom victims and one removing the detection from
> oom_evaluate_task. Just to make it easier to revert if somebody on nommu
> actually notices a regression.

Sure. Making it easier to revert is a good thing.
But this patch comes after PATCH 3/4. Need to solve previous problem.

On 2018/06/07 20:16, Michal Hocko wrote:
> Acked-by: Michal Hocko <mhocko@xxxxxxxx>
> 
> with a minor nit
> 
>> ---
>>  mm/oom_kill.c | 13 ++++++++-----
>>  1 file changed, 8 insertions(+), 5 deletions(-)
>>
>> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>> index 23ce67f..5a6f1b1 100644
>> --- a/mm/oom_kill.c
>> +++ b/mm/oom_kill.c
>> @@ -1073,15 +1073,18 @@ bool out_of_memory(struct oom_control *oc)
>>  	}
>>  
>>  	select_bad_process(oc);
>> +	if (oc->chosen == (void *)-1UL)
> 
> I think this one deserves a comment.
> 	/* There is an inflight oom victim *.
> 
>> +		return true;

OK. Though, this change will be reverted by PATCH 4/4.
But this patch comes after PATCH 1/4. Need to solve previous problem.

On 2018/06/07 20:13, Michal Hocko wrote:
> On Thu 07-06-18 20:00:21, Tetsuo Handa wrote:
> 
> Your s-o-b is missing here. And I suspect this should be From: /me
> but I do not care all that much.

How can I do that? Forge the From: line (assuming that mail server does not
reject forged From: line)?

But I am quite surprised that you did not respond PATCH 2/4 with Nacked-by:
because

On 2018/06/07 20:11, Michal Hocko wrote:
> On Thu 07-06-18 20:00:20, Tetsuo Handa wrote:
> [...]
>> @@ -4238,6 +4237,12 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
>>  	/* Retry as long as the OOM killer is making progress */
>>  	if (did_some_progress) {
>>  		no_progress_loops = 0;
>> +		/*
>> +		 * This schedule_timeout_*() serves as a guaranteed sleep for
>> +		 * PF_WQ_WORKER threads when __zone_watermark_ok() == false.
>> +		 */
>> +		if (!tsk_is_oom_victim(current))
>> +			schedule_timeout_uninterruptible(1);
>>  		goto retry;
>>  	}
> 
> Nacked-by: Michal Hocko <mhocko@xxxxxxxx>
> 
> as explainaed several times already. This moving code just to preserve
> the current logic without any arguments to back them must stop finally.
> We have way too much of this "just in case" code that nobody really
> understands and others just pile on top. Seriously this is not how the
> development should work.
> 

I am purposely splitting into PATCH 1/4 and PATCH 2/4 in order to make it
easier to revert (like you suggested doing so for PATCH 4/4) in case
somebody actually notices an unexpected side effect.

PATCH 1/4 is doing logically correct thing, no matter how you hate
the short sleep which will be removed by PATCH 2/4.

PATCH 1/4 is proven to be safe. But PATCH 2/4 is not tested to be safe.
PATCH 1/4 is safe for stable. But 2/4 might not be safe for stable.
Therefore, I insist on two separate patches.

If you can accept what PATCH 1/4 + PATCH 2/4 are doing, you are free to post
one squashed patch.