On 10/26/2015 11:37 AM, Peter Krempa wrote: > On Mon, Oct 26, 2015 at 11:06:21 -0400, John Ferlan wrote: >> If we have a shutdown of a VM by a qemu agent while an agent EOF occurs >> at relatively the same time, it's possible that a deadlock occurs depending >> on what happens first. >> >> Assume qemuProcessHandleAgentEOF runs first, with the vm->lock it clears >> priv->agent, then clears the vm->lock. > > Couldn't we make sure, that qemuProcessHandleAgentEOF clears priv->agent > if and only if it removed the last reference? reference to? agent? vm? via? qemuConnectAgent/qemuAgentOpen takes out a reference to the agent. That is un-referenced by qemuAgentClose. The EnterAgent takes out a reference which is un-referenced during ExitAgent. Adding a new reference just for EOF processing doesn't solve the problem. EOF doesn't do an agent-unlock, it does a vm-lock/unlock, and calls qemuAgentClose which will attempt an agent-lock. The agent may be locked by some other thread and it needs to wait until that reference is cleared. Adding some sort of unref-agent check logic in EOF similar to ExitAgent feels like it'll cause other issues. It seems "backwards" to remove the last reference and avoid everything else that qemuCloseAgent does. I think logically if some sort of unref-agent check logic was added, then qemuCloseAgent could not be called from either EOF or ExitAgent. >From EOF, if the current thread was the last one, then the agent is free'd so qemuCloseAgent shouldn't be called. If the current thread wasn't the last one, then we'd have to wait for the last reference check in ExitAgent, but once that is done, the agent is freed and thus qemuCloseAgent shouldn't be called. > [...] > >> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c >> index 890d8ed..a908df8 100644 >> --- a/src/qemu/qemu_domain.c >> +++ b/src/qemu/qemu_domain.c >> @@ -1868,19 +1868,20 @@ qemuDomainObjEnterAgent(virDomainObjPtr obj) >> * Should be paired with an earlier qemuDomainObjEnterAgent() call >> */ >> void >> -qemuDomainObjExitAgent(virDomainObjPtr obj) >> +qemuDomainObjExitAgent(virDomainObjPtr obj, >> + qemuAgentPtr agent) > > You would not need to modify this prototype ... > >> { >> qemuDomainObjPrivatePtr priv = obj->privateData; >> bool hasRefs; >> >> - hasRefs = virObjectUnref(priv->agent); >> + hasRefs = virObjectUnref(agent); >> >> if (hasRefs) >> - virObjectUnlock(priv->agent); >> + virObjectUnlock(agent); >> >> virObjectLock(obj); >> - VIR_DEBUG("Exited agent (agent=%p vm=%p name=%s)", >> - priv->agent, obj, obj->def->name); >> + VIR_DEBUG("Exited agent (agent=%p vm=%p name=%s hasRefs=%d)", >> + agent, obj, obj->def->name, hasRefs); >> >> priv->agentStart = 0; >> if (!hasRefs) > > >> void qemuDomainObjEnterRemote(virDomainObjPtr obj) >> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c >> index a2cc002..b8c9ff7 100644 >> --- a/src/qemu/qemu_driver.c >> +++ b/src/qemu/qemu_driver.c >> @@ -1994,9 +1994,10 @@ static int qemuDomainShutdownFlags(virDomainPtr dom, unsigned int flags) >> qemuDomainSetFakeReboot(driver, vm, isReboot); >> >> if (useAgent) { >> + qemuAgentPtr agent = priv->agent; >> qemuDomainObjEnterAgent(vm); >> - ret = qemuAgentShutdown(priv->agent, agentFlag); >> - qemuDomainObjExitAgent(vm); >> + ret = qemuAgentShutdown(agent, agentFlag); >> + qemuDomainObjExitAgent(vm, agent); > > ... and could avoid this rather ugly code here. The result would be IMO > the same. > So IYO is it reasonable to access priv->agent even though we don't own the vm-lock once EnterAgent returns? John -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list