Tejun Heo wrote:
> On Fri, Sep 01, 2017 at 07:07:25AM +0900, Tetsuo Handa wrote:
> > cond_resched() from !PF_WQ_WORKER threads is sufficient for PF_WQ_WORKER threads to run.
> > But cond_resched() is not sufficient for rescuer threads to start processing a pending work.
> > An explicit scheduling (e.g. schedule_timeout_*()) by PF_WQ_WORKER threads is needed for
> > rescuer threads to start processing a pending work.
>
> I'm not even sure this is the case. Unless I'm mistaken, in your
> workqueue dumps, the available workers couldn't even leave idle which
> means that they likely didn't get scheduled at all. It looks like
> genuine multi minute starvation by competing direct reclaims. What's
> the load number like while these events are in progress?

I don't know the load number because the system is unresponsive due to global OOM.
The only information I can collect is via printk() from SysRq.

But I guess that it is genuine multi minute starvation by competing direct reclaims,
for I ran 1024 threads on 4 or 8 CPUs / 4GB RAM / no swap in order to test a heavy
memory pressure situation where the WQ_MEM_RECLAIM mm_percpu_wq work stays pending
when I check with SysRq-t.
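
For readers who want a feel for the shape of such a load, a minimal sketch follows. It is not the reproducer used for the reports in this thread: the names touch_memory() and run_pressure(), the chunk sizes and the defaults are all hypothetical, and it forks processes rather than threads because CPython threads are serialized by the interpreter lock and would not fault in memory concurrently. It assumes Linux (the "fork" start method). Scaled up to something like 1024 workers on a 4GB machine without swap, it is expected to drive the machine into OOM, so it should only be run that way on a disposable test box.

```python
import mmap
import multiprocessing as mp
import time


def touch_memory(chunk_bytes, chunks, hold_seconds):
    """Allocate `chunks` buffers and write one byte per page.

    Writing to every page forces the kernel to actually back the
    allocation with RAM instead of leaving it as untouched zero pages.
    """
    held = []
    for _ in range(chunks):
        buf = bytearray(chunk_bytes)
        for off in range(0, chunk_bytes, mmap.PAGESIZE):
            buf[off] = 1
        held.append(buf)
    # Keep the memory resident so that all workers overlap.
    time.sleep(hold_seconds)


def run_pressure(workers=4, chunk_bytes=1 << 20, chunks=8, hold_seconds=0.0):
    """Fork `workers` processes that allocate and touch memory concurrently.

    Returns the number of workers that exited cleanly. A worker killed
    by the OOM killer has a negative exit code and is not counted.
    The defaults are deliberately small; the sizes are illustrative only.
    """
    ctx = mp.get_context("fork")  # assumption: Linux / POSIX host
    procs = [
        ctx.Process(target=touch_memory,
                    args=(chunk_bytes, chunks, hold_seconds))
        for _ in range(workers)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return sum(1 for p in procs if p.exitcode == 0)
```

With small arguments, e.g. run_pressure(workers=4, chunk_bytes=1 << 16, chunks=2), the call just returns the count of workers that finished. Whether a larger run reproduces the pending mm_percpu_wq work depends on the kernel and the machine, and would still have to be confirmed with SysRq-t.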