On 11/13/2013 10:53 PM, Daniel P. Berrange wrote:
> On Fri, Nov 08, 2013 at 02:42:26PM -0500, Rich Felker wrote:
>> On Fri, Nov 08, 2013 at 01:30:09PM +0800, Daniel P. Berrange wrote:
>>> On Thu, Nov 07, 2013 at 09:15:43PM +0800, Gao feng wrote:
>>>> I met a problem that container blocked by seteuid/setegid
>>>> which is call in lxcContainerSetID on UP system and libvirt
>>>> compiled with --with-fuse=yes.
>>>>
>>>> I looked into the glibc's codes, and found setxid in glibc
>>>> calls futex() to wait for other threads to change their
>>>> setxid_futex to 0(see setxid_mark_thread in glibc).
>>>>
>>>> since the process created by clone system call will not
>>>> share the memory with the other threads and the context
>>>> of memory doesn't changed until we call execl.(COW)
>>>>
>>>> So if the process which created by clone is called before
>>>> fuse thread being stated, the new setxid_futex of fuse
>>>> thread will not be saw in this process, it will be blocked
>>>> forever.
>>>>
>>>> Maybe this problem should be fixed in glibc, but I send
>>>> this patch as a quick fix.
>>>
>>> Can you show a stack trace of the threads/processes deadlocking
>>
>> I think this is a symptom of setxid not being async-signal-safe like
>> it's required to be. I'm not sure if we have a bug tracker entry for
>> that; if not, it should be added. But if clone() is being used except
>> in a fork-like manner, this is probably invalid application usage too.
>
> We are not using clone() in a manner that is strictly equivalent
> to fork(). Libvirt is using clone() to create Linux containers
> with new namespaces. eg we do
>
>   clone(CLONE_NEWPID|CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWUSER|CLONE_NEWNET|SIGCHLD)
>
>
> IIUC, if a process is multi-threaded you should restrict yourself to
> use of async signal safe functions in between fork() and exec(). I
> assume this restriction applies to clone() and exec() pairings too.
>
> Libvirt is in fact violating rules about only using async signal safe
> functions between clone() and exec() in many places. So I think what
> we need to do is avoid starting any threads in the parent until after
> we've clone()'d to create the new child namespace.

Thanks.

As for fuse, any attempt to access files exported by fuse will block
until the fuse thread starts running fuse_loop.

I will post an update.

Thanks guys.
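
To make the ordering discussed above concrete, here is a minimal sketch, not the libvirt code. It assumes plain fork() in place of our clone() call with namespace flags (which needs privileges), and helper_thread() and spawn_child_before_threads() are hypothetical names; helper_thread() only stands in for the fuse thread. The child is created while the parent is still single-threaded, and it calls only async-signal-safe functions (execl, _exit) before the exec.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a helper thread such as the fuse loop. */
static void *helper_thread(void *arg)
{
    (void)arg;
    return NULL;
}

/*
 * Create the child first, start threads in the parent second.
 * Returns the child's exit status, or -1 on error.
 * Assumption: fork() is used here instead of clone() with namespace flags.
 */
int spawn_child_before_threads(void)
{
    pthread_t tid;
    int status;

    /* The parent has not started any thread yet at this point. */
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: only async-signal-safe calls between fork() and exec(). */
        execl("/bin/sh", "sh", "-c", "exit 0", (char *)NULL);
        _exit(127); /* reached only if execl() failed */
    }

    /* Parent: now it is safe to start the helper thread. */
    if (pthread_create(&tid, NULL, helper_thread, NULL) != 0) {
        waitpid(pid, &status, 0); /* reap the child before bailing out */
        return -1;
    }
    pthread_join(tid, NULL);

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling spawn_child_before_threads() from main() should return 0 wherever /bin/sh exists; on older glibc the build may need -pthread.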