On 07/09/2013 11:41 PM, Laine Stump wrote: > On 07/09/2013 09:17 PM, Eric Blake wrote: >> https://bugzilla.redhat.com/show_bug.cgi?id=964358 >> >> POSIX states that multi-threaded apps should not use functions >> that are not async-signal-safe between fork and exec, yet we >> were using getpwuid_r and initgroups. Although rare, it is >> possible to hit deadlock in the child, when it tries to grab >> a mutex that was already held by another thread in the parent. >> I actually hit this deadlock when testing multiple domains >> being started in parallel with a command hook, with the following >> backtrace in the child: >> >> +++ b/src/lxc/lxc_container.c >> @@ -1,5 +1,5 @@ >> /* >> - * Copyright (C) 2008-2012 Red Hat, Inc. >> + * Copyright (C) 2008-2013 Red Hat, Inc. >> * Copyright (C) 2008 IBM Corp. >> * >> * lxc_container.c: file description >> @@ -347,12 +347,17 @@ int lxcContainerWaitForContinue(int control) >> */ >> static int lxcContainerSetID(virDomainDefPtr def) >> { >> + gid_t *groups; >> + int ngroups; >> + >> /* Only call virSetUIDGID when user namespace is enabled >> * for this container. And user namespace is only enabled >> * when nuidmap&ngidmap is not zero */ >> >> VIR_DEBUG("Set UID/GID to 0/0"); >> - if (def->idmap.nuidmap && virSetUIDGID(0, 0) < 0) { >> + if (def->idmap.nuidmap && >> + ((ngroups = virGetGroupList(0, 0, &groups) < 0) || >> + virSetUIDGID(0, 0, groups, ngroups) < 0)) { >> virReportSystemError(errno, "%s", >> _("setuid or setgid failed")); >> return -1; [1] >> @@ -1904,7 +1928,7 @@ parenterror: >> >> /* set desired uid/gid, then attempt to create the directory */ >> >> - if (virSetUIDGID(uid, gid) < 0) { >> + if (virSetUIDGID(uid, gid, groups, ngroups) < 0) { >> ret = -errno; >> goto childerror; >> } > > > It's too late for me to be reviewing code - I can tell because I spent > several minutes being concerned that you didn't free groups in this code > path. Duh. That's really all I can say to myself :-) Only _after_ posting, did I reread my patch and your comments, and while _this_ instance doesn't need a free, I noticed that the lxc_container.c instance [1] leaks groups (at least, until exec() makes leaks irrelevant). But _that_ in turn implies that I called virGetGroupList in between clone() and exec(), which is precisely what I documented should not be done. I guess I'd better work on a followup patch that hoists the virGetGroupList to occur before the lxc clone(). -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
Attachment:
signature.asc
Description: OpenPGP digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list