Re: [PATCH] mm,page_alloc: PF_WQ_WORKER threads must sleep at should_reclaim_retry().

Michal Hocko <mhocko@xxxxxxxxxx> · Mon, 30 Jul 2018 20:51:10 +0200

On Mon 30-07-18 08:44:24, Tejun Heo wrote:
> Hello,
> 
> On Tue, Jul 31, 2018 at 12:25:04AM +0900, Tetsuo Handa wrote:
> > WQ_MEM_RECLAIM guarantees that "struct task_struct" is preallocated. But
> > WQ_MEM_RECLAIM does not guarantee that the pending work is started as soon
> > as an item was queued. Same rule applies to both WQ_MEM_RECLAIM workqueues 
> > and !WQ_MEM_RECLAIM workqueues regarding when to start a pending work (i.e.
> > when schedule_timeout_*() is called).
> > 
> > Is this correct?
> 
> WQ_MEM_RECLAIM guarantees that there's always gonna exist at least one
> kworker running the workqueue.  But all per-cpu kworkers are subject
> to concurrency limiting execution - ie. if there are any per-cpu
> kworkers actively running on a cpu, no further kworkers will be scheduled.

Well, in the ideal world we would _use_ that pre-allocated kworker if
no others are available because they are doing something that takes
a long time to accomplish. The page allocator can spend a lot of time
if we are struggling to death to get some memory.

> > >              We can add timeout mechanism to workqueue so that it
> > > kicks off other kworkers if one of them is in running state for too
> > > long, but idk, if there's an indefinite busy loop condition in kernel
> > > threads, we really should get rid of them and hung task watchdog is
> > > pretty effective at finding these cases (at least with preemption
> > > disabled).
> > 
> > Currently the page allocator has a path which can loop forever with
> > only cond_resched().
> 
> Yeah, workqueue can choke on things like that and kthread indefinitely
> busy looping doesn't do anybody any good.

Yeah, I do agree. But this is much easier said than done ;) Sure,
we have that hack that sleeps rather than calling cond_resched() in
the page allocator. We can and will "fix" it to be unconditional in
should_reclaim_retry [1], but this whole thing is really subtle. It just
takes one misbehaving worker and something which is really important to
run will get stuck.
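
To make the hack and the proposed "fix" concrete, here is a minimal
userspace sketch, not the kernel code. It assumes the current sleep is
only reached on the "retry" path and that the fix moves it to a common
exit. struct task, short_sleep(), yield_cpu() and both retry_*()
helpers are hypothetical stand-ins; only the PF_WQ_WORKER name and the
two scheduler calls named in the comments come from the kernel.

```c
#include <stdbool.h>

/* Stand-in for the kernel task flag; the value is illustrative only. */
#define PF_WQ_WORKER 0x00000020u

/* Hypothetical, stripped-down replacement for struct task_struct. */
struct task {
	unsigned int flags;
};

/* Counters so the behaviour can be observed from a test. */
static int sleeps;
static int yields;

/* Models schedule_timeout_uninterruptible(1): the worker really sleeps. */
static void short_sleep(void) { sleeps++; }

/* Models cond_resched(): the task stays runnable from the WQ's view. */
static void yield_cpu(void) { yields++; }

/* The hack: workqueue workers sleep, everybody else just yields. */
static void backoff(const struct task *t)
{
	if (t->flags & PF_WQ_WORKER)
		short_sleep();
	else
		yield_cpu();
}

/*
 * Assumed current shape: the backoff is only reached when we decide
 * to retry, so a worker taking the "give up" path never sleeps.
 */
static bool retry_before(const struct task *t, bool zone_reclaimable)
{
	if (zone_reclaimable) {
		backoff(t);
		return true;
	}
	return false;
}

/*
 * Assumed fixed shape: every exit goes through the backoff, so a
 * workqueue worker sleeps regardless of the decision.
 */
static bool retry_after(const struct task *t, bool zone_reclaimable)
{
	bool ret = zone_reclaimable;

	backoff(t);
	return ret;
}
```

The point of the second variant is that concurrency management only
notices a worker that actually sleeps, so the sleep must not depend on
which way the retry decision goes.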

That being said, I will post the patch with an updated changelog
recording this.

[1] http://lkml.kernel.org/r/ca3da8b8-1bb5-c302-b190-fa6cebab58ca@xxxxxxxxxxxxxxxxxxx
-- 
Michal Hocko
SUSE Labs