On 12/09/2016 03:55 AM, Nikolay Shirokovskiy wrote: > > > On 08.12.2016 19:40, John Ferlan wrote: >> >> >> On 11/24/2016 04:19 AM, Nikolay Shirokovskiy wrote: >>> qemuAgentNotifyEvent notify on a lock condition without taking >>> the lock. This works but it is a subject to race conditions. >>> --- >>> src/qemu/qemu_agent.c | 4 ++++ >>> 1 file changed, 4 insertions(+) >>> >> >> But the vm is locked prior to any priv->agent dereference and call - so >> what path could free priv->agent before we get into this NotifyEvent > > This patche does not fix NULL or dangling pointer reference... > >> code? I suppose it wouldn't hurt, but we're not entering the agent and >> the AgentEOF would require vm lock to clear agent. > > It is just we can not deliver signal reliably if lock is not taken. It is > race condition. Acessing monitor fields without lock in general is risky too. > > For example imagine we are in the process of sending agent command, vm lock > is dropped, sending thread holds agent lock and at the same time reset comes - > it is possible as nobody holds vm lock, so we come here and now we have > dangerous simultaneous access from both threads - sending and that notifies us > of a reset. OK - and this is the type of information/details that should go into the commit message. When you chase something down, leave a few bread crumbs of details. Anyway the details are, await_event is only set for qemuAgentShutdown and qemuAgentSuspend. Both have the agent lock to do so and their callers are the only ones that care. So the issue is that one of those two is called and at some time after releasing the domain lock and before setting await_event to RESET, SHUTDOWN, or SUSPEND that qemuAgentNotifyEvent is called from qemuProcessHandleReset, qemuProcessHandleShutdown, qemuProcessHandlePMSuspend, or qemuProcessHandlePMSuspendDisk which only take the vmobj lock before looking at agent fields. With that kind of detail provided in the commit message wouldn't leave me wondering what was being chased. As you saw my first assumption was the race you were referring to is the agent free race. With the extra details I think I'm able to piece together what race you are referring to. So ACK to the patch, but when you create your v2 for this series, please update the commit message with more details. Feel free to liberally steal the above paragraph. John > > Nikolay > > >> >>> diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c >>> index 5230cbc..ad031d0 100644 >>> --- a/src/qemu/qemu_agent.c >>> +++ b/src/qemu/qemu_agent.c >>> @@ -1248,6 +1248,8 @@ qemuAgentMakeStringsArray(const char **strings, unsigned int len) >>> void qemuAgentNotifyEvent(qemuAgentPtr mon, >>> qemuAgentEvent event) >>> { >>> + virObjectLock(mon); >>> + >>> VIR_DEBUG("mon=%p event=%d await_event=%d", mon, event, mon->await_event); >>> if (mon->await_event == event) { >>> mon->await_event = QEMU_AGENT_EVENT_NONE; >>> @@ -1257,6 +1259,8 @@ void qemuAgentNotifyEvent(qemuAgentPtr mon, >>> virCondSignal(&mon->notify); >>> } >>> } >>> + >>> + virObjectUnlock(mon); >>> } >>> >>> VIR_ENUM_DECL(qemuAgentShutdownMode); >>> -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list