On Wed, Dec 17, 2014 at 01:58:28PM +0100, Ján Tomko wrote: > On 12/17/2014 01:21 PM, Daniel P. Berrange wrote: > > On Wed, Dec 17, 2014 at 01:18:41PM +0100, Michal Privoznik wrote: > >> On 16.12.2014 17:41, Ján Tomko wrote: > >>> The domain might disappear during the time in monitor when > >>> the virDomainObjPtr is unlocked, so the caller needs to check > >>> if it's still alive. > >>> > >>> Introduce a helper function for the callers that want to report > >>> an error in this case. > >>> --- > >>> src/qemu/qemu_domain.c | 17 +++++++++++++++++ > >>> src/qemu/qemu_domain.h | 3 +++ > >>> 2 files changed, 20 insertions(+) > >>> > >>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c > >>> index 8dd427a..d074429 100644 > >>> --- a/src/qemu/qemu_domain.c > >>> +++ b/src/qemu/qemu_domain.c > >>> @@ -1606,6 +1606,23 @@ void qemuDomainObjExitMonitor(virQEMUDriverPtr driver, > >>> qemuDomainObjExitMonitorInternal(driver, obj); > >>> } > >>> > >>> +/* obj must NOT be locked before calling > >>> + * > >>> + * Same as qemuDomainObjExitMonitor, but reports an error and > >>> + * returns -1 if the domain is no longer alive after exiting the monitor > >> > >> Can you state explicitly that this function needs to be called whenever the > >> domain object or vm->def, or privdata (any runtime info held within vm in > >> general) is used after ExitMonitor()? > > > > Right, I can add this to qemu/THREADS.txt as well. > > > Are there any cases where that is not true ? eg could we just avoid > > introducing this new function, by making the existing funtion check > > liveliness every time. That would avoid the class of bugs you outline > > above entirely. > > I was worried about backports, but new code checking the return value would > fail to compile with the void function in older libvirt. We should add ATTRIBUTE_RETURNCHECK to force all callers to check the return status too > There are functions that only call EndJob and return after exiting the > monitor. But if the domain crashed before the API even returned, can we really > claim success? There are some APIs where we do if (running) do x else do y even with those though, I think we're justified in reporting an error if the domain stopped at any point in the 'do x' code path - after all we're already doing to be doing that if it happens while we're issuing a mointor API call. So this endjob check is only extending that window of opportunity for errors by a couple more instructions. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list