On Tue, Jan 19, 2016 at 12:18:41PM -0500, Tejun Heo wrote:
> If "cpuset.memory_migrate" is set, when a process is moved from one
> cpuset to another with a different memory node mask, pages in use by
> the process are migrated to the new set of nodes.  This was performed
> synchronously in the ->attach() callback, which is synchronized
> against process management.  Recently, the synchronization was changed
> from per-process rwsem to global percpu rwsem for simplicity and
> optimization.
>
> Combined with the synchronous mm migration, this led to deadlocks
> because mm migration could schedule a work item which may in turn try
> to create a new worker blocking on the process management lock held
> from the cgroup process migration path.
>
> This heavy an operation shouldn't be performed synchronously from that
> deep inside cgroup migration in the first place.  This patch punts the
> actual migration to an ordered workqueue and updates cgroup process
> migration and cpuset config update paths to flush the workqueue after
> all locks are released.  This way, the operations still seem
> synchronous to userland without entangling mm migration with process
> management synchronization.  CPU hotplug can also invoke mm migration
> but there's no reason for it to wait for mm migrations and thus
> it doesn't synchronize against their completion.
>
> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
> Reported-and-tested-by: Christian Borntraeger <borntraeger@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx # v4.4+

Applied to cgroup/for-4.5-fixes.

Thanks.

--
tejun
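
For readers who want the shape of the fix without reading the kernel code, here is a rough userspace sketch of the pattern the changelog describes: queue the heavy work on an ordered (single-worker) queue while the lock is held, then flush the queue only after the lock has been dropped. It is written in Python, not kernel C, and every name in it (OrderedWorkqueue, process_lock, migrate_pages, attach) is a hypothetical stand-in rather than a kernel API. The assumption that the queued work itself needs the process-management lock is a simplification of the worker-creation case in the changelog.

```python
import queue
import threading


class OrderedWorkqueue:
    """Hypothetical stand-in for an ordered workqueue: one worker, FIFO order."""

    def __init__(self):
        self._q = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            fn = self._q.get()
            try:
                fn()
            finally:
                # Mark the item done so flush() can return.
                self._q.task_done()

    def queue_work(self, fn):
        """Queue a callable; returns immediately without running it."""
        self._q.put(fn)

    def flush(self):
        """Block until every item queued so far has finished."""
        self._q.join()


# Stand-in for the process-management lock held during cgroup migration.
process_lock = threading.Lock()
migration_wq = OrderedWorkqueue()
log = []


def migrate_pages(task, nodes):
    # Assumption for illustration: the heavy work needs the same lock,
    # so running it synchronously under the lock would deadlock.
    with process_lock:
        log.append((task, nodes))


def attach(task, nodes):
    with process_lock:
        # Under the lock, only queue the migration; do not wait for it.
        migration_wq.queue_work(lambda: migrate_pages(task, nodes))
    # All locks are released here, so waiting is safe and the caller
    # still sees the operation as synchronous.
    migration_wq.flush()


attach("task-a", (0, 1))
attach("task-b", (2,))
print(log)
```

Calling migration_wq.flush() inside the "with process_lock" block in attach() would hang in this sketch, which is the userspace analogue of the deadlock described above. A caller that has no reason to wait (CPU hotplug in the changelog) would simply queue the work and skip the flush.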