On Fri, Nov 08, 2013 at 02:42:26PM -0500, Rich Felker wrote:
> On Fri, Nov 08, 2013 at 01:30:09PM +0800, Daniel P. Berrange wrote:
> > On Thu, Nov 07, 2013 at 09:15:43PM +0800, Gao feng wrote:
> > > I met a problem that container blocked by seteuid/setegid
> > > which is call in lxcContainerSetID on UP system and libvirt
> > > compiled with --with-fuse=yes.
> > >
> > > I looked into the glibc's codes, and found setxid in glibc
> > > calls futex() to wait for other threads to change their
> > > setxid_futex to 0(see setxid_mark_thread in glibc).
> > >
> > > since the process created by clone system call will not
> > > share the memory with the other threads and the context
> > > of memory doesn't changed until we call execl.(COW)
> > >
> > > So if the process which created by clone is called before
> > > fuse thread being stated, the new setxid_futex of fuse
> > > thread will not be saw in this process, it will be blocked
> > > forever.
> > >
> > > Maybe this problem should be fixed in glibc, but I send
> > > this patch as a quick fix.
> >
> > Can you show a stack trace of the threads/processes deadlocking
>
> I think this is a symptom of setxid not being async-signal-safe like
> it's required to be. I'm not sure if we have a bug tracker entry for
> that; if not, it should be added. But if clone() is being used except
> in a fork-like manner, this is probably invalid application usage too.

We are not using clone() in a manner that is strictly equivalent to
fork(). Libvirt is using clone() to create Linux containers with new
namespaces. eg we do

  clone(CLONE_NEWPID|CLONE_NEWNS|CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWUSER|CLONE_NEWNET|SIGCHLD)

IIUC, if a process is multi-threaded you should restrict yourself to
use of async signal safe functions in between fork() and exec(). I
assume this restriction applies to clone() and exec() pairings too.

Libvirt is in fact violating rules about only using async signal safe
functions between clone() and exec() in many places.
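
To illustrate the rule I mean, here is a rough sketch of a child that
calls only async-signal-safe functions (execv() and _exit()) between
clone() and exec(). This is not libvirt's code: the names child_main,
spawn_and_wait and CHILD_STACK_SIZE are made up for the example, and it
assumes Linux with glibc's clone() wrapper and a downward-growing stack.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stack size for the child; chosen arbitrarily. */
#define CHILD_STACK_SIZE (64 * 1024)

/* Runs in the child. Only async-signal-safe calls are made here:
 * no malloc(), no stdio, no setuid()/seteuid() family. */
static int child_main(void *arg)
{
    char *const *argv = arg;

    execv(argv[0], argv);
    _exit(127);             /* reached only if execv() failed */
}

/* Hypothetical helper: clone a child with the given extra flags
 * (for example CLONE_NEWPID|CLONE_NEWNS, which typically need
 * privileges), exec argv in it, and return its exit status,
 * or -1 on error. */
static int spawn_and_wait(int clone_flags, char *const argv[])
{
    int status = 0;
    pid_t pid;
    char *stack = malloc(CHILD_STACK_SIZE);  /* allocated in the parent */

    if (stack == NULL)
        return -1;

    /* Pass the top of the buffer, since the stack grows downwards. */
    pid = clone(child_main, stack + CHILD_STACK_SIZE,
                clone_flags | SIGCHLD, (void *)argv);
    if (pid < 0) {
        free(stack);
        return -1;
    }

    if (waitpid(pid, &status, 0) < 0) {
        free(stack);
        return -1;
    }
    free(stack);

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With clone_flags set to 0 this behaves like fork() followed by exec(),
so it can be tried without privileges, e.g. with an argv of
{"/bin/sh", "-c", "exit 7", NULL}. Anything the child needs that is not
async-signal-safe would have to be prepared in the parent before the
clone() call.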

So I think what we need to do is avoid starting any threads in the
parent until after we've clone()'d to create the new child namespace.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|