Hello, Sorry about the delay. On Tue, Feb 07, 2017 at 04:34:59PM +0100, Michal Hocko wrote: > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index c3358d4f7932..b6411816787a 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -2343,7 +2343,16 @@ void drain_local_pages(struct zone *zone) > > static void drain_local_pages_wq(struct work_struct *work) > { > + /* > + * drain_all_pages doesn't use proper cpu hotplug protection so > + * we can race with cpu offline when the WQ can move this from > + * a cpu pinned worker to an unbound one. We can operate on a different > + * cpu which is allright but we also have to make sure to not move to > + * a different one. > + */ > + preempt_disable(); > drain_local_pages(NULL); > + preempt_enable(); > } > > /* > @@ -2379,12 +2388,6 @@ void drain_all_pages(struct zone *zone) > } > > /* > - * As this can be called from reclaim context, do not reenter reclaim. > - * An allocation failure can be handled, it's simply slower > - */ > - get_online_cpus(); > - > - /* > * We don't care about racing with CPU hotplug event > * as offline notification will cause the notified > * cpu to drain that CPU pcps and on_each_cpu_mask > @@ -2423,7 +2426,6 @@ void drain_all_pages(struct zone *zone) > for_each_cpu(cpu, &cpus_with_pcps) > flush_work(per_cpu_ptr(&pcpu_drain, cpu)); > > - put_online_cpus(); > mutex_unlock(&pcpu_drain_mutex); I think this would work; however, a more canonical way would be something along the line of... drain_all_pages() { ... spin_lock(); for_each_possible_cpu() { if (this cpu should get drained) { queue_work_on(this cpu's work); } } spin_unlock(); ... } offline_hook() { spin_lock(); this cpu should get drained = false; spin_unlock(); queue_work_on(this cpu's work); flush_work(this cpu's work); } I think what workqueue should do is automatically flush in-flight CPU work items on CPU offline and erroring out on queue_work_on() on offline CPUs. And we now actually can do that because we have lifted the guarantee that queue_work() is local CPU affine some releases ago. I'll look into it soonish. For the time being, either approach should be fine. The more canonical one might be a bit less surprising but the preempt_disable/enable() change is short and sweet and completely fine for the case at hand. Please feel free to add Acked-by: Tejun Heo <tj@xxxxxxxxxx> Thanks. -- tejun -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>