Hi Serge and Daniel, On Sat, May 9, 2009 at 4:03 AM, Serge E. Hallyn <serue@xxxxxxxxxx> wrote: > Quoting Daniel P. Berrange (berrange@xxxxxxxxxx): >> On Fri, May 08, 2009 at 08:34:12AM -0500, Serge E. Hallyn wrote: >> > Quoting Ryota Ozaki (ozaki.ryota@xxxxxxxxx): >> > > Hi Serge, >> > > >> > > On Fri, May 8, 2009 at 11:48 AM, Serge E. Hallyn <serue@xxxxxxxxxx> wrote: >> > > > IIUC, the real problem is that src/cgroup.c assumes that the >> > > > cgroup name should be $CGROUP_MOUNTPOINT/groupname. But of >> > > > course if the ns cgroup is enabled, then the unshare(CLONE_NEWNS) >> > > > to create a new namespace in which to mount the new devpts >> > > > locks the driver under $CGROUP_MOUNTPOINT/<pid_of_driver>/ >> > > > or somesuch. >> > > > >> > > > If this fixes the problem I have no objections, but it seems >> > > > more fragile than perhaps trying to teach src/cgroup.c to >> > > > consider it's current cgroup as a starting point. >> > > >> > > hmm, I don't know why the assumption is bad and how the approach >> > > you are suggesting helps the ns problem. >> > >> > To be clear, the asssumption is that the driver starts in the >> > root cgroup, i.e. it's pid is listed in $CGROUP_MOUNTPOINT/tasks. >> > And that it can create $CGROUP_MOUNTPOINT/groupname and move >> > itself into $CGROUP_MOUNTPOINT/groupname/tasks. >> > >> > So, the assumption is bad because when the driver does a >> > unshare(CLONE_NEWNS), it gets moved into $CGROUP_MOUNTPOINT/X, >> > and after that can only move itself into >> > $CGROUP_MOUNTPOINT/X/groupname. >> > >> > Even with your patch, it's possible for the lxc driver to have >> > been started under say $CGROUP_MOUNTPOINT/libvir or >> > $CGROUP_MOUNTPOINT/<username> through libcgroup/PAM for instance, >> > in which case your patch would be insufficient. >> >> Indeed, we can't assume we're in the root group at any time. Our >> general plan is that libvirt itself uses 3 levels of cgroup >> starting at the cgroup that libvirtd was placed in by the admin >> of the host, then a per-driver group, then per-guest groups >> >> $LIBVIRTD_ROOT_CGROUP >> | >> +- lxc >> | | >> | +- LXC-GUEST-1 >> | +- LXC-GUEST-2 >> | +- LXC-GUEST-3 >> | +- ... >> | >> +- qemu >> | >> +- QEMU-GUEST-1 >> +- QEMU-GUEST-2 >> +- QEMU-GUEST-3 >> +- ... >> >> $LIBVIRTD_ROOT_CGROUP, may be the actaul root mount point for >> the cgroup controller in question, or it may be a sub-directory >> that the admin chose to put it in. Also, if running libvirtd >> as a normal user, the admin may have created per-user account >> cgroups and so libvirtd would end up in there. So we have to >> be prepared for anything wrt initial libvirtd cgroup root. >> >> NB The code for putting QEMU in a cgroup isn't merged yet. > > That sounds good. I just don't think the code currently does > it. To do the right thing, IIUC, virCgroupPathOfGroup() should > parse the /proc/pid/cgroup contents of the 'controller' process, > and insert that between the virCgroupGetMount(controller) > result and the group name. > > Or something... > > (right?) > > thanks, > -serge > OK I probably understood the problem and wrote a patch for that according to the Serge's suggestion. I expect this patch fixes the problem, however unfortunately I've not tried it yet under the situation you are worrying and just done regression tests. Because I don't know easy way to produce the situation. If you know that, please let me know... Thanks, ozaki-r >From 713be8125b5cbd3b60919afe708f9c788206a572 Mon Sep 17 00:00:00 2001 From: Ryota Ozaki <ozaki.ryota@xxxxxxxxx> Date: Sun, 10 May 2009 08:32:41 +0900 Subject: [PATCH] cgroups: address libvirtd's bad assumption for cgroups hierarchy --- src/cgroup.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++++++------ 1 files changed, 70 insertions(+), 8 deletions(-) diff --git a/src/cgroup.c b/src/cgroup.c index 50517e2..e02c37f 100644 --- a/src/cgroup.c +++ b/src/cgroup.c @@ -117,11 +117,60 @@ int virCgroupHaveSupport(void) return 0; } +static int virCgroupGetCurrentPath(int pid, char **curpath) +{ + char buf[PATH_MAX+1024]; + char *path = NULL; + int rc = 0; + int len; + int fd; + char *pos = NULL; + + if (virAsprintf(&path, "/proc/%d/cgroup", pid) == -1) { + rc = -ENOMEM; + goto error; + } + + fd = open(path, O_RDONLY); + if (fd < 0) { + DEBUG("Unable to open %s: %m", path); + rc = -errno; + goto error_open; + } + + len = saferead(fd, buf, sizeof(buf)); + if (len <= 0) { + rc = -errno; + goto error_saferead; + } + buf[len-1] = '¥0'; /* overwrite '¥n' */ + + pos = strstr(buf, ":/"); + if (pos == NULL) { + rc = -1; + goto error_strstr; + } + + pos += 2; /* skip ':/' */ + if (virAsprintf(curpath, "%s", pos) == -1) { + rc = -ENOMEM; + } + +error_strstr: +error_saferead: + close(fd); +error_open: + VIR_FREE(path); +error: + return rc; +} + static int virCgroupPathOfGroup(const char *group, const char *controller, char **path) { virCgroupPtr root = NULL; + char *curpath = NULL; int rc = 0; root = virCgroupGetMount(controller); @@ -130,8 +179,14 @@ static int virCgroupPathOfGroup(const char *group, goto out; } - if (virAsprintf(path, "%s/%s", root->path, group) == -1) + rc = virCgroupGetCurrentPath(getpid(), &curpath); + if (rc < 0) + goto out; + + if (virAsprintf(path, "%s/%s/%s", root->path, curpath, group) == -1) rc = -ENOMEM; + + VIR_FREE(curpath); out: virCgroupFree(&root); @@ -145,6 +200,7 @@ static int virCgroupPathOf(const char *grppath, virCgroupPtr root; int rc = 0; char *controller = NULL; + char *curpath = NULL; if (strchr(key, '.') == NULL) return -EINVAL; @@ -158,8 +214,14 @@ static int virCgroupPathOf(const char *grppath, goto out; } - if (virAsprintf(path, "%s/%s/%s", root->path, grppath, key) == -1) + rc = virCgroupGetCurrentPath(getpid(), &curpath); + if (rc < 0) + goto out; + + if (virAsprintf(path, "%s/%s/%s/%s", root->path, curpath, grppath, key) == -1) rc = -ENOMEM; + + VIR_FREE(curpath); out: virCgroupFree(&root); VIR_FREE(controller); @@ -625,15 +687,14 @@ int virCgroupAddTask(virCgroupPtr group, pid_t pid) char *grppath = NULL; char *taskpath = NULL; char *pidstr = NULL; + virCgroupPtr root; for (i = 0; supported_controllers[i] != NULL; i++) { - rc = virCgroupPathOfGroup(group->path, - supported_controllers[i], - &grppath); - if (rc != 0) - goto done; + root = virCgroupGetMount(supported_controllers[i]); + if (root == NULL) + continue; - if (virAsprintf(&taskpath, "%s/tasks", grppath) == -1) { + if (virAsprintf(&taskpath, "%s/%s/tasks", root->path, group->path) == -1) { rc = -ENOMEM; goto done; } @@ -655,6 +716,7 @@ int virCgroupAddTask(virCgroupPtr group, pid_t pid) } done: + virCgroupFree(&root); VIR_FREE(grppath); VIR_FREE(taskpath); VIR_FREE(pidstr); -- 1.6.0.6 -- Libvir-list mailing list Libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list