Re: [PATCHv2 5/7] cgroup: introduce cgroup namespaces

Aditya Kali <adityakali@xxxxxxxxxx> · Mon, 3 Nov 2014 15:40:11 -0800

On Fri, Oct 31, 2014 at 5:02 PM, Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
> On Fri, Oct 31, 2014 at 12:18 PM, Aditya Kali <adityakali@xxxxxxxxxx> wrote:
>> Introduce the ability to create new cgroup namespace. The newly created
>> cgroup namespace remembers the cgroup of the process at the point
>> of creation of the cgroup namespace (referred as cgroupns-root).
>> The main purpose of cgroup namespace is to virtualize the contents
>> of /proc/self/cgroup file. Processes inside a cgroup namespace
>> are only able to see paths relative to their namespace root
>> (unless they are moved outside of their cgroupns-root, at which point
>>  they will see a relative path from their cgroupns-root).
>> For a correctly setup container this enables container-tools
>> (like libcontainer, lxc, lmctfy, etc.) to create completely virtualized
>> containers without leaking system level cgroup hierarchy to the task.
>> This patch only implements the 'unshare' part of the cgroupns.
>>
>
>> +       /* Prevent cgroup changes for this task. */
>> +       threadgroup_lock(current);
>
> This could just be me being dense, but what is the lock for?
>

threadgroup_lock() is there to prevent the task from changing cgroups
while we are unsharing cgroupns.
But it seems that this might be unnecessary now because we have
removed the pinning restriction. Without pinning, we don't care if the
task cgroup changes underneath us. I will remove it from here as well
as from cgroupns_install().

>> +
>> +       /* CGROUPNS only virtualizes the cgroup path on the unified hierarchy.
>> +        */
>> +       cgrp = get_task_cgroup(current);
>> +
>> +       err = -ENOMEM;
>> +       new_ns = alloc_cgroup_ns();
>> +       if (!new_ns)
>> +               goto err_out_unlock;
>> +
>> +       err = proc_alloc_inum(&new_ns->proc_inum);
>> +       if (err)
>> +               goto err_out_unlock;
>> +
>> +       new_ns->user_ns = get_user_ns(user_ns);
>> +       new_ns->root_cgrp = cgrp;
>> +
>> +       threadgroup_unlock(current);
>> +
>> +       return new_ns;
>> +
>> +err_out_unlock:
>> +       threadgroup_unlock(current);
>> +err_out:
>> +       if (cgrp)
>> +               cgroup_put(cgrp);
>> +       kfree(new_ns);
>> +       return ERR_PTR(err);
>> +}
>> +
>> +static int cgroupns_install(struct nsproxy *nsproxy, void *ns)
>> +{
>> +       pr_info("setns not supported for cgroup namespace");
>> +       return -EINVAL;
>> +}
>> +
>> +static void *cgroupns_get(struct task_struct *task)
>> +{
>> +       struct cgroup_namespace *ns = NULL;
>> +       struct nsproxy *nsproxy;
>> +
>> +       rcu_read_lock();
>> +       nsproxy = task->nsproxy;
>> +       if (nsproxy) {
>> +               ns = nsproxy->cgroup_ns;
>> +               get_cgroup_ns(ns);
>> +       }
>> +       rcu_read_unlock();
>
> How is this correct?  Other namespaces do it too, so it Must Be
> Correct (tm), but I don't understand.  What is RCU protecting?
>
> --Andy

-- 
Aditya
--
To unsubscribe from this list: send the line "unsubscribe linux-api" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html