On Thu, Jan 23, 2014 at 03:06:16PM -0500, Adam Walters wrote:
> This patchset implements a tiered driver loading system. I split the hypervisor
> drivers out into their own tier, which is loaded after the other drivers. This
> has the net effect of ensuring that things like secrets, networks, etc., are
> initialized and auto-started before any hypervisors, such as QEMU, LXC, etc.
> This resolves the race condition currently present when starting libvirtd
> while domains are running, which happens when restarting libvirtd after having
> started at least one domain.

[snip]

> For anyone who is not familiar with the race condition I mention above, the
> basic description is that upon restarting libvirtd, any running QEMU domains
> using storage volumes are killed randomly due to their associated storage pool
> not yet being online. This is due to storage pool auto-start not having
> completed prior to QEMU initialization. In my prior testing, I found that this
> race condition affected at least one domain approximately 40% of the time. I
> sent this information to the mailing list back on 06DEC2013, if anyone is
> interested in going back and re-reading my description.

Ok, so the current sequence is /supposed/ to be

  1 StateInitialize
     1.1 StorageInitialize
         1.1.1 Load all configs
         1.1.2 Detect currently active pools
     1.2 QEMUInitialize
         1.2.1 Load all configs
         1.2.2 Detect currently active guests
  2 StateAutostart
     2.1 StorageAutostart
         2.1.1 Start any inactive pools marked autostart
     2.2 QEMUAutostart
         2.2.1 Start any inactive guests marked autostart

Looking at the storage driver code though, I see a flaw in that 1.1.2 does
not actually exist. The checking for existing active storage pools is only
done in step 2.1.1 as part of autostart, which as you say is clearly too
late.

Also, the checking for existing active storage pools looks like it only
works for storage pools which actually have some persistent kernel /
userspace state outside of libvirt. ie iSCSI pools will remain active even
when libvirt is not running, since the iSCSI initiator is outside the scope
of libvirt. If we have an RBD pool, however, which is using librbd instead
of the FUSE driver, there's no persistent state to detect.

The issue here is that the storage driver is not correctly keeping track of
what storage pools were active prior to restart. ie any storage pools that
were active before the restart should still be active after the restart,
*regardless* of the autostart flag.

So I understand why your proposed split of driver loading works to solve
this, but even with this it only solves the startup ordering problem.
There's still the issue that we're reliant on the autostart flag being set.
If there's a guest with an RBD volume against a pool that isn't set to
autostart, we're still doomed.

The storage drivers are just plain broken wrt libvirt restarts because
they're not correctly recording their state and restoring back to the exact
same state after restart.

Daniel
--
|: http://berrange.com        -o-  http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org         -o-  http://virt-manager.org                 :|
|: http://autobuild.org       -o-  http://search.cpan.org/~danberr/        :|
|: http://entangle-photo.org  -o-  http://live.gnome.org/gtk-vnc           :|
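
[Editorial sketch, not part of the original thread: a minimal, self-contained
C illustration of the missing "1.1.2" step described above -- restoring pools
that were active before the daemon restart at initialize time, independent of
the autostart flag. All type and function names here (pool, pool_start,
storage_initialize, storage_autostart) are hypothetical and are not the real
libvirt storage driver API.]

    /* Hypothetical sketch: restore previously-active pools at initialize
     * time, then let autostart handle the remaining inactive pools. */
    #include <stdbool.h>
    #include <stdio.h>

    typedef struct {
        const char *name;
        bool autostart;   /* user-requested autostart flag              */
        bool was_active;  /* recorded runtime state from before restart */
        bool active;      /* current runtime state                      */
    } pool;

    /* Stand-in for whatever "start" really means (mount, login, ...). */
    static void pool_start(pool *p)
    {
        p->active = true;
        printf("started pool %s\n", p->name);
    }

    /* Initialize phase (1.1): after loading configs, bring back any pool
     * that was active before the restart, regardless of autostart. */
    static void storage_initialize(pool *pools, size_t npools)
    {
        for (size_t i = 0; i < npools; i++) {
            if (pools[i].was_active && !pools[i].active)
                pool_start(&pools[i]);
        }
    }

    /* Autostart phase (2.1.1): only touches pools still inactive. */
    static void storage_autostart(pool *pools, size_t npools)
    {
        for (size_t i = 0; i < npools; i++) {
            if (pools[i].autostart && !pools[i].active)
                pool_start(&pools[i]);
        }
    }

    int main(void)
    {
        /* An RBD-style pool without autostart, but active before restart,
         * plus an autostart pool that was inactive. */
        pool pools[] = {
            { "rbd-pool",   false, true,  false },
            { "iscsi-pool", true,  false, false },
        };

        storage_initialize(pools, 2);  /* rbd-pool comes back first      */
        storage_autostart(pools, 2);   /* then iscsi-pool is autostarted */
        return 0;
    }

With this ordering, a pool such as the RBD example in the message above comes
back online before any hypervisor driver initializes, even though its
autostart flag is unset, because its prior runtime state was recorded and
restored rather than inferred from autostart.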