On 09.12.2016 17:42, John Ferlan wrote: > > > On 12/09/2016 03:55 AM, Nikolay Shirokovskiy wrote: >> >> >> On 08.12.2016 19:40, John Ferlan wrote: >>> >>> >>> On 11/24/2016 04:19 AM, Nikolay Shirokovskiy wrote: >>>> qemuAgentNotifyEvent notify on a lock condition without taking >>>> the lock. This works but it is a subject to race conditions. >>>> --- >>>> src/qemu/qemu_agent.c | 4 ++++ >>>> 1 file changed, 4 insertions(+) >>>> >>> >>> But the vm is locked prior to any priv->agent dereference and call - so >>> what path could free priv->agent before we get into this NotifyEvent >> >> This patche does not fix NULL or dangling pointer reference... >> >>> code? I suppose it wouldn't hurt, but we're not entering the agent and >>> the AgentEOF would require vm lock to clear agent. >> >> It is just we can not deliver signal reliably if lock is not taken. It is >> race condition. Acessing monitor fields without lock in general is risky too. >> >> For example imagine we are in the process of sending agent command, vm lock >> is dropped, sending thread holds agent lock and at the same time reset comes - >> it is possible as nobody holds vm lock, so we come here and now we have >> dangerous simultaneous access from both threads - sending and that notifies us >> of a reset. > > > OK - and this is the type of information/details that should go into the > commit message. When you chase something down, leave a few bread crumbs > of details. > > Anyway the details are, await_event is only set for qemuAgentShutdown > and qemuAgentSuspend. Both have the agent lock to do so and their > callers are the only ones that care. So the issue is that one of those > two is called and at some time after releasing the domain lock and > before setting await_event to RESET, SHUTDOWN, or SUSPEND that > qemuAgentNotifyEvent is called from qemuProcessHandleReset, > qemuProcessHandleShutdown, qemuProcessHandlePMSuspend, or > qemuProcessHandlePMSuspendDisk which only take the vmobj lock before > looking at agent fields. > > With that kind of detail provided in the commit message wouldn't leave > me wondering what was being chased. As you saw my first assumption was > the race you were referring to is the agent free race. With the extra > details I think I'm able to piece together what race you are referring > to. So ACK to the patch, but when you create your v2 for this series, > please update the commit message with more details. Feel free to > liberally steal the above paragraph. > Ok. Honestly I did not think of a detailed race screnario when I send a patch. It is just hit me that notify is called without a lock, I thought before it is incorrect code. As it turns out it is not, one can call notify without a lock. However there is a reason why it is worth taking lock anyway - http://stackoverflow.com/questions/4544234/calling-pthread-cond-signal-without-locking-mutex Nikolay >> >> >>> >>>> diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c >>>> index 5230cbc..ad031d0 100644 >>>> --- a/src/qemu/qemu_agent.c >>>> +++ b/src/qemu/qemu_agent.c >>>> @@ -1248,6 +1248,8 @@ qemuAgentMakeStringsArray(const char **strings, unsigned int len) >>>> void qemuAgentNotifyEvent(qemuAgentPtr mon, >>>> qemuAgentEvent event) >>>> { >>>> + virObjectLock(mon); >>>> + >>>> VIR_DEBUG("mon=%p event=%d await_event=%d", mon, event, mon->await_event); >>>> if (mon->await_event == event) { >>>> mon->await_event = QEMU_AGENT_EVENT_NONE; >>>> @@ -1257,6 +1259,8 @@ void qemuAgentNotifyEvent(qemuAgentPtr mon, >>>> virCondSignal(&mon->notify); >>>> } >>>> } >>>> + >>>> + virObjectUnlock(mon); >>>> } >>>> >>>> VIR_ENUM_DECL(qemuAgentShutdownMode); >>>> -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list