On 10.08.2020 20:40, Daniel Henrique Barboza wrote:
>
>
> On 7/23/20 7:14 AM, Nikolay Shirokovskiy wrote:
>> We are dropping the only reference here so that the event loop thread
>> is going to be exited synchronously. In order to avoid deadlocks we
>> need to unlock the VM so that any handler being called can finish
>> execution and thus the event loop thread can finish too.
>>
>> Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@xxxxxxxxxxxxx>
>> ---
>>  src/qemu/qemu_domain.c | 18 ++++++++++++++----
>>  1 file changed, 14 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
>> index 5b22eb2..82b3d11 100644
>> --- a/src/qemu/qemu_domain.c
>> +++ b/src/qemu/qemu_domain.c
>> @@ -1637,11 +1637,21 @@ void
>>  qemuDomainObjStopWorker(virDomainObjPtr dom)
>>  {
>>      qemuDomainObjPrivatePtr priv = dom->privateData;
>> +    virEventThread *eventThread;
>>
>> -    if (priv->eventThread) {
>> -        g_object_unref(priv->eventThread);
>> -        priv->eventThread = NULL;
>> -    }
>> +    if (!priv->eventThread)
>> +        return;
>> +
>> +    /*
>> +     * We are dropping the only reference here so that the event loop thread
>> +     * is going to be exited synchronously. In order to avoid deadlocks we
>> +     * need to unlock the VM so that any handler being called can finish
>> +     * execution and thus the event loop thread can finish too.
>> +     */
>> +    eventThread = g_steal_pointer(&priv->eventThread);
>> +    virObjectUnlock(dom);
>> +    g_object_unref(eventThread);
>> +    virObjectLock(dom);
>
> I understand the intention of the code thanks to the comment just before
> it, but this is not robust. This unlock -> do stuff -> lock pattern will only
> work if the caller of qemuDomainObjStopWorker() is holding the mutex.
>
> I see that this is the case for qemuDomainObjStopWorkerIter(), but
> qemuDomainObjStopWorker() is also called by qemuProcessStop().
> qemuProcessStop() does not take or release the mutex itself, so you'll be
> assuming that all callers of qemuProcessStop() hold the mutex before calling
> it. One of them is qemuProcessInit(), which calls qemuProcessStop() in the
> 'stop' jump, and there is no mutex lock beforehand there either.
>
> Now we're assuming that all callers of qemuProcessInit() are holding the
> mutex lock beforehand. In a quick glance at the code I saw at least 2
> instances where this isn't the case, so then we'd need to assume that the
> callers of those functions hold the mutex lock. So on and so forth.
>
>
> Given that this code only makes sense when called from
> qemuDomainObjStopWorkerIter(), I'd suggest removing the lock/unlock of this
> function (turning it into just a call to qemuDomainObjStopWorker()) and
> moving them inside the body of qemuDomainObjStopWorker(), locking and
> unlocking the mutex inside the scope of the same function.
>

Hi, Daniel.

Actually, all callers of qemuProcessStop() hold the lock. Moreover, they even
hold the job condition, and calling qemuProcessStop() without the lock/job
condition would be a mistake AFAIU. (qemuConnectDomainXMLToNative is the only
exception I see that holds the lock but not the job condition, but that
function creates a new vm object that is not shared with other threads.)

I understand your concern, but there are precedents. Take a look, for example,
at virDomainObjListRemove. The lock requirements are documented for
virDomainObjListRemove, and I can do the same for qemuDomainObjStopWorker too,
but it looks a bit like over-documenting to me.

Nikolay