On Mon, Dec 02, 2013 at 06:17:00PM +0100, Michal Privoznik wrote: > On 17.10.2013 14:40, Daniel P. Berrange wrote: > > On Tue, Oct 15, 2013 at 06:03:27PM +0200, Christophe Fergeau wrote: > >> Hey, > >> > >> Due to a configuration issue on my system, libvirtd is not starting on my > >> system (not complaining about this!): > >> 2013-10-15 15:40:51.024+0000: 10222: info : libvirt version: 1.1.3 > >> 2013-10-15 15:40:51.024+0000: 10222: error : virNetTLSContextCheckCertFile:117 : Cannot read CA certificate > >> '/home/teuf/usr/etc/pki/CA/cacert.pem': No such file or directory > >> > >> However, before libvirtd exits, I get another error message from the netcf > >> code, which is unexpected this time: > >> 2013-10-15 15:49:18.361+0000: 10222: error : netcfStateCleanup:105 : > >> internal error: Attempt to close netcf state driver already closed > >> > >> This message comes from the call of virStateCleanup() at the end of main() > >> in libvirtd.c. virStateCleanup() should not be called before > >> daemonStateInit() has been called in main. > >> After this call, things get more ugly as daemonStateInit() calls > >> virStateInitialize() from a thread, so there's probably a small window for > >> virStateInitialize() and virStateCleanup() running concurrently if an error > >> occurs between the call to daemonStateInit() and the call to > >> virNetServerRun(). > >> > >> I'm sending this email rather than a patch as I'm not sure what is the best > >> way to fix it. The easy way would be for virStateCleanup() to be a noop > >> when virStateInitialize() hasn't been called (iow remove the error message > >> from netcfStateCleanup). However, this would leave this small race > >> condition around (which is not that bad as it would only occurs in > >> situations when the daemon fails to start). So another approach would be to > >> set a vir_state_initialized boolean once the thread has called > >> ivrStateInitialize, and only call virStateCleanup() when it's set. > >> > >> Or maybe there's a 3rd way to fix this? > >> > >> Let me know if you have any guidance into the best way to fix this, > > > > Yeah, I've known about this issue for a while and its a bit horrible > > to solve. I think we probably need to have a global mutex to serialize > > the virStateInitialize, virStateReload and virStateCleanup calls so > > that none of them can run in parallel. > > > > I'd have virGlobalInit create the mutex instance, since we know that > > is run from thread safe context. > > > > Then make the virState* calls hold that mutex while executing. > > > > We don't want virStateCleanup to skip execution if virStateInitialize > > has failed though - every callback in virStateCleanup should be written > > to be safe if its corresponding init function hasn't run. Just the > > mutex serialization ought to be sufficient I think. > > > > > > Daniel > > > > Sorry for bringing up a dead thread, but as you both may have noticed I > was trying to solve this too: > > https://www.redhat.com/archives/libvir-list/2013-November/msg01311.html > > And I don't think Dan, your approach is gonna work. I mean, the reason > for running virStateInitialize in thread is - it may take ages to Actually no, the reason for using a thread is that the initialization code needs to have a working event loop - so you cannot run the virStateInitialize code from the same thread that is running the event loop. You're probably thinking of the QEMU VM autostart code, whicih spawns a separate thread to auto-start QEMU VMs to avoid delaying use of the QEMU driver while (slow) autostart takes place. > complete. Hence, serializing virState* will lead to main thread waiting > for the virStateInitialize to complete (which may not happen, i.e. in > case of deadlock somewhere in a driver). > > But I was thinking of rather refined locking. Turning each of > vir*DriverTab into virObjectLockable(). Then, each access would require > object to lock itself. Moreover, the virStateCleanup would unregister > the driver. Unfortunately, it could only unregister stateful drivers. This race condition scenario only hits us in two cases - libvirtd startup is being aborted due to some failure while StateInit is still running - libvirtd reload is triggered by SIGHUP causing StateReload to run In both of those cases I think it is absolutely desirable that virStateReload and virStateCleanup both block until virStateInitialize completes. The question of virStateInitialize deadlock has no bearing on this. Any such problems are simply a bug that we must fix. So I don't see why we shouldn't have a global mutex serializing the three virState{Init,Reload,Cleanup} APIs. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list