should_reclaim_retry() should be a natural reschedule point. PF_WQ_WORKER is a special case which needs a stronger rescheduling policy. Doing that unconditionally seems more straightforward than depending on a zone being a good candidate for a further reclaim. Thus, move the short sleep when we are waiting for the owner of oom_lock (which coincidentally also serves as a guaranteed sleep for PF_WQ_WORKER threads) to should_reclaim_retry(). Note that it is not evaluated that whether there is negative side effect with this change. We need to test both real and artificial workloads for evaluation. You can compare with and without this patch if you noticed something unexpected. Signed-off-by: Michal Hocko <mhocko@xxxxxxxx> Cc: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> Cc: Roman Gushchin <guro@xxxxxx> Cc: Michal Hocko <mhocko@xxxxxxxx> Cc: Johannes Weiner <hannes@xxxxxxxxxxx> Cc: Vladimir Davydov <vdavydov.dev@xxxxxxxxx> Cc: David Rientjes <rientjes@xxxxxxxxxx> Cc: Tejun Heo <tj@xxxxxxxxxx> --- mm/page_alloc.c | 40 ++++++++++++++++++---------------------- 1 file changed, 18 insertions(+), 22 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index e90f152..210a476 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -3914,6 +3914,7 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask) { struct zone *zone; struct zoneref *z; + bool ret = false; /* * Costly allocations might have made a progress but this doesn't mean @@ -3977,25 +3978,26 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask) } } - /* - * Memory allocation/reclaim might be called from a WQ - * context and the current implementation of the WQ - * concurrency control doesn't recognize that - * a particular WQ is congested if the worker thread is - * looping without ever sleeping. Therefore we have to - * do a short sleep here rather than calling - * cond_resched(). - */ - if (current->flags & PF_WQ_WORKER) - schedule_timeout_uninterruptible(1); - else - cond_resched(); - - return true; + ret = true; + goto out; } } - return false; +out: + /* + * Memory allocation/reclaim might be called from a WQ + * context and the current implementation of the WQ + * concurrency control doesn't recognize that + * a particular WQ is congested if the worker thread is + * looping without ever sleeping. Therefore we have to + * do a short sleep here rather than calling + * cond_resched(). + */ + if (current->flags & PF_WQ_WORKER) + schedule_timeout_uninterruptible(1); + else + cond_resched(); + return ret; } static inline bool @@ -4237,12 +4239,6 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask) /* Retry as long as the OOM killer is making progress */ if (did_some_progress) { no_progress_loops = 0; - /* - * This schedule_timeout_*() serves as a guaranteed sleep for - * PF_WQ_WORKER threads when __zone_watermark_ok() == false. - */ - if (!tsk_is_oom_victim(current)) - schedule_timeout_uninterruptible(1); goto retry; } -- 1.8.3.1