On Wed 08-02-17 11:49:11, Vlastimil Babka wrote: > On 02/07/2017 10:09 PM, Michal Hocko wrote: > > From: Michal Hocko <mhocko@xxxxxxxx> > > > > We currently have 2 specific WQ_RECLAIM workqueues. One for updating > > pcp stats vmstat_wq and one dedicated to drain per cpu lru caches. This > > seems more than necessary because both can run on a single WQ. Both > > do not block on locks requiring a memory allocation nor perform any > > allocations themselves. We will save one rescuer thread this way. > > > > On the other hand drain_all_pages queues work on the system wq which > > doesn't have rescuer and so this depend on memory allocation (when all > > workers are stuck allocating and new ones cannot be created). This is > > not critical as there should be somebody invoking the OOM killer (e.g. > > the forking worker) and get the situation unstuck and eventually > > performs the draining. Quite annoying though. This worker should be > > using WQ_RECLAIM as well. We can reuse the same one as for lru draining > > and vmstat. > > > > Suggested-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx> > > Signed-off-by: Michal Hocko <mhocko@xxxxxxxx> > > --- > > > > Hi, > > Tetsuo has noted that drain_all_pages doesn't use WQ_RECLAIM [1] > > and asked whether we can move the worker to the vmstat_wq which is > > WQ_RECLAIM. I think the deadlock he has described shouldn't happen but > > it would be really better to have the rescuer. I also think that we do > > not really need 2 or more workqueues and also pull lru draining in. > > > > What do you think? Please note I haven't tested it yet. > > Why not, I guess, of course I may be overlooking some subtlety. You could > have CC'd Tejun and Christoph. will do on the next submission > Watch out for the init order though, maybe? Is there no caller of the > lru/pcp drain before module_init(setup_vmstat) happens? Hard to tell. I expect that there shouldn't be but I will add diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 0c0a7c38cd91..73018f07bcc9 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2370,6 +2370,13 @@ void drain_all_pages(struct zone *zone) */ static cpumask_t cpus_with_pcps; + /* + * Make sure nobody triggers this path before vmstat_wq is fully + * initialized. + */ + if (WARN_ON(!vmstat_wq)) + return; + /* Workqueues cannot recurse */ if (current->flags & PF_WQ_WORKER) return; diff --git a/mm/swap.c b/mm/swap.c index 23f09d6dd212..39c240fc9d48 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -676,6 +676,13 @@ void lru_add_drain_all(void) static struct cpumask has_work; int cpu; + /* + * Make sure nobody triggers this path before vmstat_wq is fully + * initialized. + */ + if (WARN_ON(!vmstat_wq)) + return; + mutex_lock(&lock); get_online_cpus(); cpumask_clear(&has_work); to be sure. [...] > > @@ -1763,9 +1762,11 @@ static int vmstat_cpu_dead(unsigned int cpu) > > > > static int __init setup_vmstat(void) > > { > > -#ifdef CONFIG_SMP > > - int ret; > > + int ret = 0; > > + > > + vmstat_wq = alloc_workqueue("vmstat", WQ_FREEZABLE|WQ_MEM_RECLAIM, 0); > > Did you want to set ret to -ENOMEM if the alloc fails, or something? > Otherwise I don't see why the changes. no I just didn't want to get defined but not used warning for CONFIG_SMP=n > > > > > +#ifdef CONFIG_SMP > > ret = cpuhp_setup_state_nocalls(CPUHP_MM_VMSTAT_DEAD, "mm/vmstat:dead", > > NULL, vmstat_cpu_dead); > > if (ret < 0) > > @@ -1789,7 +1790,7 @@ static int __init setup_vmstat(void) > > proc_create("vmstat", S_IRUGO, NULL, &proc_vmstat_file_operations); > > proc_create("zoneinfo", S_IRUGO, NULL, &proc_zoneinfo_file_operations); > > #endif > > - return 0; > > + return ret; > > } > > module_init(setup_vmstat) > > > > -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>