On Fri, May 17, 2013 at 10:05:25PM -0600, Eric Blake wrote: > I found root cause on two nasty races today, but ran out of time to > write the patches until next week. > > 1. We have a TOCTTOU race when moving processes between cgroups. > virCgroupMoveTask can fail if listing the threads in the parent group > picks up multiple threads, then one of those threads exits before we > then write to the emulator subgroup to relocate those threads; the chain > reaction from this failure is that the domain fails to start. qemu > spawns short-lived threads all the time (possibly due to glibc using > threads under the hood to implement aio as qemu first probes a disk > image); with a 3-second-delay hook script, I was able to hit this race > about 5% of the time when repeatedly starting multiple domains in > parallel. The fix should be ignoring any ESRCH failures when writing > during virCgroupAddTAskStrController, but I plan to test that before > posting my patch. Present at least since 0.10.2 (Fedora 18); probably > earlier (but I didn't check how far back). It is worse than that - you can't simply ignore ESRCH. In the same way that existing processes can exit, new processes can come into live. The virCgroupMoveTask method doesn't take any of this into account. The real problem is that we should not be trying to move QEMU after it has been started - we should aim to get it into the right place right away. > 2. We have a (rare) deadlock due to the use of a non-async-signal-safe > getpwuid_r in between fork and exec. I was extremely surprised when I > hit it in real life while trying to debug the first race; but sure > enough, the backtrace confirms that glibc grabs a mutex inside > getpwuid_r, and if some other thread in the parent process is > interrupted in the middle of getpwuid_r while we are forking a child in > our thread, then our child is hung; and since the parent depends on a > handshake with the child, the parent's worker thread was also hung. I > think the fix to this is to call getpwuid_r prior to fork, and change > virSetUIDGID to take the list of supplemental groups as a parameter > rather than trying to learn it on the fly. Again, something I plan to > test once I have the patch written. Present at least since commit > f42cf7cb (Dec 2010) with the introduction of virSetUIDGID, and probably > further back. Hmm, interesting. Nasty issue. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list