Hi Tejun,

Could you kindly look into this problem and give some advice? Or is it a
design flaw of cgroup v1?

Thanks a lot,
Jiufei

On 2019/2/18 2:36 PM, Jiufei Xue wrote:
> Hello, Tejun
>
> On 2019/2/15 11:51 PM, Tejun Heo wrote:
>> Hello, Jiufei.
>>
>> On Fri, Feb 15, 2019 at 02:19:00PM +0800, Jiufei Xue wrote:
>>> The situation can be described as follows.
>>>
>>>   rmdir                                umount
>>>   cgroup_rmdir
>>>     kill_css
>>>       css_killed_ref_fn
>>>         queue work to destroy css
>>>                                        cgroup_kill_sb
>>>                                          cgroup_put
>>>   work performed to destroy css
>>>
>>> Since the percpu_ref of the cgroup_root->cgrp.css hasn't been switched
>>> to atomic mode, the cgroup_root can't be destroyed even when it is no
>>> longer referenced. This causes the remount failure in rebind_subsystems()
>>> when remounting with different controllers.
>>>
>>> I have no idea how to solve this problem. If we offline the controller
>>> root when there is no alive child, it will trigger the mount hang for
>>> the memory controller (commit 3c606d35fe).
>>>
>>> Any ideas? Or should we solve this problem at all?
>>
>> Can we just flush the workqueue before trying mount?
>>
> I think flushing the workqueue before trying mount does not work because
> the cgroup_roots for the pids and hugetlb subsystems can never be
> released.
>
> Or do you mean flushing the workqueue before umount? That also has a
> problem. If we flush the workqueue before umount, we can ensure that the
> removed children are offlined, but they may not be released. And cgroups
> for some controllers such as memory can be pinned forever because of
> page cache.
>
> Thanks,
> Jiufei
>
>> Thanks.
>>
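
For illustration, a minimal userspace sketch of the rmdir/umount/remount
sequence described above (added for clarity, not part of the original
thread). The mount point and controller names are arbitrary assumptions,
it must run as root with those controllers not already mounted elsewhere,
and the final mount only fails when it races with the deferred css
destroy work.

/*
 * Sketch: mount a v1 hierarchy, create and remove a child cgroup,
 * unmount, then immediately remount with a different controller set.
 */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
	const char *mnt = "/mnt/cg";
	char child[64];

	mkdir(mnt, 0755);

	/* mount a v1 hierarchy with a single controller */
	if (mount("cgroup", mnt, "cgroup", 0, "pids")) {
		perror("mount pids");
		return 1;
	}

	/* create and immediately remove a child cgroup;
	 * css destruction is queued to a workqueue */
	snprintf(child, sizeof(child), "%s/test", mnt);
	mkdir(child, 0755);
	rmdir(child);

	/* unmount while the destroy work may still be pending */
	if (umount(mnt))
		perror("umount");

	/* remount with a different controller combination; if the old
	 * cgroup_root is still pinned, rebind_subsystems() is expected
	 * to make this fail with EBUSY */
	if (mount("cgroup", mnt, "cgroup", 0, "pids,hugetlb"))
		fprintf(stderr, "remount failed: %s\n", strerror(errno));
	else
		umount(mnt);

	return 0;
}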