Re: [PATCH] mm,oom: Don't call schedule_timeout_killable() with oom_lock held.

Michal Hocko <mhocko@xxxxxxxxxx> · Thu, 31 May 2018 20:47:21 +0200

On Fri 01-06-18 00:23:57, Tetsuo Handa wrote:
> On 2018/05/31 19:44, Michal Hocko wrote:
> > On Thu 31-05-18 19:10:48, Tetsuo Handa wrote:
> >> On 2018/05/30 8:07, Andrew Morton wrote:
> >>> On Tue, 29 May 2018 09:17:41 +0200 Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >>>
> >>>>> I suggest applying
> >>>>> this patch first, and then fix "mm, oom: cgroup-aware OOM killer" patch.
> >>>>
> >>>> Well, I hope the whole pile gets merged in the upcoming merge window
> >>>> rather than stall even more.
> >>>
> >>> I'm more inclined to drop it all.  David has identified significant
> >>> shortcomings and I'm not seeing a way of addressing those shortcomings
> >>> in a backward-compatible fashion.  Therefore there is no way forward
> >>> at present.
> >>>
> >>
> >> Can we apply my patch as-is first?
> > 
> > No. As already explained before. Sprinkling new sleeps without a strong
> > reason is not acceptable. The issue you are seeing is pretty artificial
> > and as such doesn're really warrant an immediate fix. We should rather
> > go with a well thought trhough fix. In other words we should simply drop
> > the sleep inside the oom_lock for starter unless it causes some really
> > unexpected behavior change.
> > 
> 
> The OOM killer did not require schedule_timeout_killable(1) to return
> as long as the OOM victim can call __mmput(). But now the OOM killer
> requires schedule_timeout_killable(1) to return in order to allow the
> OOM victim to call __oom_reap_task_mm(). Thus, this is a regression.
> 
> Artificial cannot become the reason to postpone my patch. If we don't care
> artificialness/maliciousness, we won't need to care Spectre/Meltdown bugs.
> 
> I'm not sprinkling new sleeps. I'm just merging existing sleeps (i.e.
> mutex_trylock() case and !mutex_trylock() case) and updating the outdated
> comments.

Sigh. So what exactly is wrong with going simple and do
http://lkml.kernel.org/r/20180528124313.GC27180@xxxxxxxxxxxxxx ?

-- 
Michal Hocko
SUSE Labs