On 2024/8/12 15:37, Chen Ridong wrote:
> The delegatee shouldn't be allowed to write to the resource control
> interface files. The kernel rejects writes to all files other than
> "cgroup.procs", "cgroup.threads" and "cgroup.subtree_control" on a
> namespace root from inside the namespace. However, the delegatee can write
> "cgroup.subtree_control" outside of the namespace. This can be reproduced
> as follows:
>
>   cd /sys/fs/cgroup
>   echo '+pids' > cgroup.subtree_control
>   mkdir dlgt_grp_ns
>   echo '+pids' > dlgt_grp_ns/cgroup.subtree_control
>   mkdir dlgt_grp_ns/dlgt_grp_ns1
>   echo $$ > dlgt_grp_ns/dlgt_grp_ns1/cgroup.procs
>   echo 200 > dlgt_grp_ns/dlgt_grp_ns1/pids.max
>   unshare -Cm /bin/bash
>   echo max > dlgt_grp_ns/dlgt_grp_ns1/pids.max // Permission denied
>   echo -pids > dlgt_grp_ns/cgroup.subtree_control // pids is unlimited now
>
> We set pids.max to 200 in the cgroup dlgt_grp_ns1, and we created a new
> cgroup namespace. The delegatee can't write to
> dlgt_grp_ns/dlgt_grp_ns1/pids.max. However, the delegatee can write to
> dlgt_grp_ns/cgroup.subtree_control, which is outside of the cgroup
> namespace, and this invalidates the pids limit.
>
> Cgroup namespaces, as delegation boundaries, should disallow the delegatee
> from writing to any interface outside of the cgroup namespace.
>
> Fixes: 5136f6365ce3 ("cgroup: implement "nsdelegate" mount option")
> Signed-off-by: Chen Ridong <chenridong@xxxxxxxxxx>
> ---
>  kernel/cgroup/cgroup.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index 8dbe00000fd4..1ef9413c02e3 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -4134,8 +4134,10 @@ static ssize_t cgroup_file_write(struct kernfs_open_file *of, char *buf,
>  	 * cgroup.procs, cgroup.threads and cgroup.subtree_control.
>  	 */
>  	if ((cgrp->root->flags & CGRP_ROOT_NS_DELEGATE) &&
> -	    !(cft->flags & CFTYPE_NS_DELEGATABLE) &&
> -	    ctx->ns != &init_cgroup_ns && ctx->ns->root_cset->dfl_cgrp == cgrp)
> +	    ctx->ns != &init_cgroup_ns &&
> +	    (!cgroup_is_descendant(cgrp, ctx->ns->root_cset->dfl_cgrp) ||
> +	    (!(cft->flags & CFTYPE_NS_DELEGATABLE) &&
> +	    ctx->ns->root_cset->dfl_cgrp == cgrp)))

Can you please match the indentation? Thanks.

>  		return -EPERM;
>
>  	if (cft->write)
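
For anyone following the logic of the new condition, here is a rough
user-space model of the predicate in the hunk above. It is only a sketch:
struct node, is_descendant() and write_denied() are hypothetical names
invented for illustration, not kernel symbols, and is_descendant() assumes
that cgroup_is_descendant() also returns true when both arguments are the
same cgroup.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cgroup: only the parent link matters. */
struct node {
	const struct node *parent;
};

/*
 * True if cgrp is ancestor itself or lies below it. This assumes the
 * kernel helper treats a cgroup as its own descendant.
 */
static bool is_descendant(const struct node *cgrp, const struct node *ancestor)
{
	for (; cgrp; cgrp = cgrp->parent)
		if (cgrp == ancestor)
			return true;
	return false;
}

/*
 * Model of the patched check in cgroup_file_write():
 *   ns_delegate  - hierarchy mounted with nsdelegate
 *   in_init_ns   - the opener's namespace is the init cgroup namespace
 *   ns_root      - root cgroup of the opener's namespace
 *   cgrp         - cgroup that owns the file being written
 *   delegatable  - file carries CFTYPE_NS_DELEGATABLE
 * Returns true where the kernel would return -EPERM.
 */
static bool write_denied(bool ns_delegate, bool in_init_ns,
			 const struct node *ns_root,
			 const struct node *cgrp, bool delegatable)
{
	if (!ns_delegate || in_init_ns)
		return false;
	/* New in the patch: anything outside the namespace is off limits. */
	if (!is_descendant(cgrp, ns_root))
		return true;
	/* Existing rule: on the namespace root only delegatable files pass. */
	return cgrp == ns_root && !delegatable;
}
```

With ns_root standing for dlgt_grp_ns1 from the reproducer, a write to
dlgt_grp_ns/cgroup.subtree_control is denied because dlgt_grp_ns is not
below the namespace root, while pids.max on dlgt_grp_ns1 stays denied by
the existing rule.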