Re: mount failed while remounting different controllers

Jiufei Xue <jiufei.xue@xxxxxxxxxxxxxxxxx> · Mon, 18 Feb 2019 14:36:57 +0800

Hello, Tejun

On 2019/2/15 11:51 PM, Tejun Heo wrote:
> Hello, Jiufei.
> 
> On Fri, Feb 15, 2019 at 02:19:00PM +0800, Jiufei Xue wrote:
>> The situation can be described as follows.
>> rmdir                            umount
>> cgroup_rmdir
>>  kill_css
>>   css_killed_ref_fn
>>    queue work to destroy css
>>                                  cgroup_kill_sb
>>                                   cgroup_put
>> work performed to destroy css
>>
>> Since the percpu_ref of the cgroup_root->cgrp.css hasn't been switched
>> to atomic mode, cgroup_root can't be destroyed even if not referenced
>> any more. And it will cause the remount failure in rebind_subsystems()
>> while remounting different controllers.
>>
>> I have no idea how to solve this problem. If we offline the controller
>> root when there are no live children, it will trigger the mount hang for
>> the memory controller (commit 3c606d35fe).
>>
>> Any ideas? Or should we solve this problem?
> 
> Can we just flush the workqueue before trying mount?
>
I don't think flushing the workqueue before trying to mount will work,
because the cgroup_root for the pids and hugetlb subsystems can never be
released.

Or do you mean flushing the workqueue before umount? That has a problem too.
If we flush the workqueue before umount, the removed children are guaranteed
to be offlined, but they may not be released. And cgroups for some controllers,
such as memory, can be pinned forever by page cache.
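
To make the sequence from my earlier mail concrete, here is a toy Python
model of it. This is only a sketch, not kernel code: ToyPercpuRef and
run_race are made-up names, a single integer stands in for the per-CPU
counters, and which references the root holds is my simplification. The
one thing it borrows from the kernel is that a percpu_ref only checks for
zero after percpu_ref_kill() has switched it to atomic mode.

```python
class ToyPercpuRef:
    """Toy stand-in for a percpu_ref (hypothetical, not the kernel code)."""

    def __init__(self, release):
        self.count = 1        # initial reference, dropped only by kill()
        self.atomic = False   # starts in percpu mode: no zero check on put()
        self.released = False
        self.release = release

    def get(self):
        self.count += 1

    def put(self):
        self.count -= 1
        # Only atomic mode looks at the count and can trigger release.
        if self.atomic and self.count == 0 and not self.released:
            self.released = True
            self.release()

    def kill(self):
        # Models percpu_ref_kill(): switch to atomic mode, drop initial ref.
        if not self.atomic:
            self.atomic = True
            self.put()


def run_race():
    events = []
    root = ToyPercpuRef(lambda: events.append("root released"))
    root.get()  # assumption: reference that umount later drops
    root.get()  # assumption: reference pinned by the child css

    # rmdir: the work to destroy the child css is queued but has not run.
    root.put()  # umount: cgroup_kill_sb -> cgroup_put
    root.put()  # the queued work finally runs and drops the child's pin

    released_before_kill = bool(events)  # nobody ever killed the root ref
    root.kill()                          # what would be needed to free it
    return released_before_kill, bool(events)


if __name__ == "__main__":
    print(run_race())  # (False, True)
```

In this model the root is unreferenced after the second put(), but nothing
releases it until kill() is called, which is the state rebind_subsystems()
runs into on the next mount.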

Thanks,
Jiufei

> Thanks.
>