On 2012/11/1 3:44, Tejun Heo wrote:
> Because ->pre_destroy() could fail and can't be called under
> cgroup_mutex, cgroup destruction did something very ugly.
>
> 1. Grab cgroup_mutex and verify it can be destroyed; fail otherwise.
>
> 2. Release cgroup_mutex and call ->pre_destroy().
>
> 3. Re-grab cgroup_mutex and verify it can still be destroyed; fail
>    otherwise.
>
> 4. Continue destroying.
>
> In addition to being ugly, it has always been broken in various ways.
> For example, memcg ->pre_destroy() expects the cgroup to be inactive
> after it's done but tasks can be attached and detached between #2 and
> #3 and the conditions that memcg verified in ->pre_destroy() might no
> longer hold by the time control reaches #3.
>
> Now that ->pre_destroy() is no longer allowed to fail, we can switch
> to the following.
>
> 1. Grab cgroup_mutex and verify it can be destroyed; fail otherwise.
>
> 2. Deactivate CSS's and mark the cgroup removed thus preventing any
>    further operations which can invalidate the verification from #1.
>
> 3. Release cgroup_mutex and call ->pre_destroy().
>
> 4. Re-grab cgroup_mutex and continue destroying.
>
> After this change, controllers can safely assume that ->pre_destroy()
> will be called only once for a given cgroup and, once
> ->pre_destroy() is called, the cgroup will stay dormant till it's
> destroyed.
>
> This removes the only reason ->pre_destroy() can fail - new task being
> attached or child cgroup being created in between. The error-out path
> is removed and ->pre_destroy() invocation is open-coded in
> cgroup_rmdir().
>
> v2: cgroup_call_pre_destroy() removal moved to this patch per Michal.
>     Commit message updated per Glauber.
>
> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
> Reviewed-by: Michal Hocko <mhocko@xxxxxxx>
> Cc: Glauber Costa <glommer@xxxxxxxxxxxxx>

Acked-by: Li Zefan <lizefan@xxxxxxxxxx>
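
For readers following the thread, the new four-step sequence from the
commit message can be sketched as a small userspace model. This is only
an illustration, not the kernel code: struct model_cgroup, model_mutex,
model_attach_task(), model_pre_destroy() and model_rmdir() are all
hypothetical names, a pthread mutex stands in for cgroup_mutex, and the
-EBUSY return value is a choice made for this model.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in; NOT the kernel's struct cgroup. */
struct model_cgroup {
	int nr_tasks;           /* tasks currently attached */
	int nr_children;        /* child groups */
	bool removed;           /* set in step 2, never cleared */
	bool destroyed;         /* set in step 4 */
	int pre_destroy_calls;  /* how often the callback ran */
};

/* Stands in for cgroup_mutex in this model. */
static pthread_mutex_t model_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Attaching is refused once the group is marked removed. */
int model_attach_task(struct model_cgroup *cg)
{
	int ret = 0;

	pthread_mutex_lock(&model_mutex);
	if (cg->removed)
		ret = -EBUSY;
	else
		cg->nr_tasks++;
	pthread_mutex_unlock(&model_mutex);
	return ret;
}

/* Stand-in for ->pre_destroy(): returns void, so it cannot fail. */
void model_pre_destroy(struct model_cgroup *cg)
{
	/* The group is already dormant when the callback runs. */
	assert(cg->removed);
	cg->pre_destroy_calls++;
}

int model_rmdir(struct model_cgroup *cg)
{
	/* Step 1: grab the mutex and verify; fail otherwise. */
	pthread_mutex_lock(&model_mutex);
	if (cg->removed || cg->nr_tasks > 0 || cg->nr_children > 0) {
		pthread_mutex_unlock(&model_mutex);
		return -EBUSY;
	}

	/* Step 2: mark removed so the check above cannot be invalidated. */
	cg->removed = true;

	/* Step 3: release the mutex and call the callback. */
	pthread_mutex_unlock(&model_mutex);
	model_pre_destroy(cg);

	/* Step 4: re-grab the mutex and continue; no second verification. */
	pthread_mutex_lock(&model_mutex);
	cg->destroyed = true;
	pthread_mutex_unlock(&model_mutex);
	return 0;
}
```

In this model a second model_rmdir() on the same group fails at step 1
because the removed flag is already set, so model_pre_destroy() runs at
most once per group, and model_attach_task() is refused from step 2
onward. That mirrors the two guarantees the commit message gives to
controllers.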