On Wed, 29 Aug 2018 06:19:30 +0900 Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote: > On 2018/08/28 19:58, Michal Hocko wrote: > > On Tue 28-08-18 18:53:51, Tetsuo Handa wrote: > >> On 2018/08/28 14:48, Michal Hocko wrote: > >>> On Tue 28-08-18 09:08:09, Tetsuo Handa wrote: > >>>> Andrew Morton wrote: > >>>>> Since should_reclaim_retry() should be a natural reschedule point, let's > >>>>> do the short sleep for PF_WQ_WORKER threads unconditionally in order to > >>>>> guarantee that other pending work items are started. This will workaround > >>>>> this problem and it is less fragile than hunting down when the sleep is > >>>>> missed. E.g. we used to have a sleeping point in the oom path but this > >>>>> has been removed recently because it caused other issues. Having a single > >>>>> sleeping point is more robust. > >>>> > >>>> I think "E.g. we used to have a sleeping point in the oom path but this has > >>>> been removed recently because it caused other issues." part should be dropped, > >>>> for commit 9bfe5ded054b8e28 ("mm, oom: remove sleep from under oom_lock") is > >>>> after the OOM killer was successfully invoked whereas this patch is when the OOM > >>>> killer cannot be invoked. The problem is that we are not allowed to assume that > >>>> schedule_timeout_uninterruptible(1) in __alloc_pages_may_oom() will serve as a > >>>> guaranteed sleep for PF_WQ_WORKER threads. > >>> > >>> That remark was mostly to show how it is fragile to have a sleeping > >>> point at multiple places. > >>> > >> > >> But that remark is confusing. We removed only schedule_timeout_killable(1) > >> inside out_of_memory(). That remark sounds as if we also removed > >> schedule_timeout_uninterruptible(1) in __alloc_pages_may_oom(). Actually we > >> did not remove schedule_timeout_uninterruptible(1) in __alloc_pages_may_oom(). > >> We still have sleeping points in multiple places, including e.g. msleep() in > >> shrink_inactive_list(). I don't think that that remark is showing "how it is > >> fragile to have a sleeping point at multiple places." > > > > I do not really feel strongly about this so if you really believe this > > is more confusing than helpful then I do not mind dropping this note. > > > > OK. Andrew, would you drop "E.g. we used to have a sleeping point in the oom > path but this has been removed recently because it caused other issues." part? OK - it now reads : Since should_reclaim_retry() should be a natural reschedule point, : let's do the short sleep for PF_WQ_WORKER threads unconditionally in : order to guarantee that other pending work items are started. This : will workaround this problem and it is less fragile than hunting down : when the sleep is missed. Having a single sleeping point is more : robust.