On Tue, Jan 10, 2012 at 05:28:50PM +0100, Michal Privoznik wrote: > If client stream does not have any data to sink and neither received > EOF, a dummy packet is sent to the daemon signalising client is ready to > sink some data. However, after we added event loop to client a race may > occur: > > Thread 1 calls virNetClientStreamRecvPacket and since no data are cached > nor stream has EOF, it decides to send dummy packet to server which will > sent some data in turn. However, during this decision and actual message > exchange with server - > > Thread 2 receives last stream data from server. Therefore an EOF is set > on stream and if there is a call waiting (which is not yet) it is woken > up. However, Thread 1 haven't sent anything so far, so there is no call > to be woken up. So this thread sent dummy packet to daemon, which > ignores that as no stream is associated with such packet and therefore > no reply will ever come. This all sounds plausible. > --- > To get better image, here are logs: http://pastebin.com/BQEbu8Zh What is the actual error / bug behaviour that is seen in applications, without this fix ? It would be good to include that in the commit msg, since 'race condition' is fairly vague. > diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c > index 469c6a5..3ea1a8f 100644 > --- a/src/rpc/virnetclient.c > +++ b/src/rpc/virnetclient.c > @@ -1672,6 +1672,7 @@ done: > */ > static int virNetClientSendInternal(virNetClientPtr client, > virNetMessagePtr msg, > + virNetClientStreamPtr st, > bool expectReply, > bool nonBlock) > { > @@ -1711,6 +1712,15 @@ static int virNetClientSendInternal(virNetClientPtr client, > goto cleanup; > } > > + /* Other thread might have already received > + * stream EOF so we don't want sent anything. > + * Server won't respond anyway. > + */ > + if (virNetClientStreamEOF(st)) { > + ret = 0; > + goto cleanup; > + } I understand your desire to put this check here, since it is thus inside the region protected by the mutex lock. I would rather we just had it in virNetClientSendWithReplyStream though. This would require moving the call to lock/unlock the mutex up one level into the callers. > + > if (virCondInit(&call->cond) < 0) { > virNetError(VIR_ERR_INTERNAL_ERROR, "%s", > _("cannot initialize condition variable")); > @@ -1757,7 +1767,7 @@ cleanup: > int virNetClientSendWithReply(virNetClientPtr client, > virNetMessagePtr msg) > { > - int ret = virNetClientSendInternal(client, msg, true, false); > + int ret = virNetClientSendInternal(client, msg, NULL, true, false); > if (ret < 0) > return -1; > return 0; > @@ -1777,7 +1787,7 @@ int virNetClientSendWithReply(virNetClientPtr client, > int virNetClientSendNoReply(virNetClientPtr client, > virNetMessagePtr msg) > { > - int ret = virNetClientSendInternal(client, msg, false, false); > + int ret = virNetClientSendInternal(client, msg, NULL, false, false); > if (ret < 0) > return -1; > return 0; > @@ -1796,5 +1806,26 @@ int virNetClientSendNoReply(virNetClientPtr client, > int virNetClientSendNonBlock(virNetClientPtr client, > virNetMessagePtr msg) > { > - return virNetClientSendInternal(client, msg, false, true); > + return virNetClientSendInternal(client, msg, NULL, false, true); > +} > + > +/* > + * @msg: a message allocated on heap or stack > + * > + * Send a message synchronously, and wait for the reply synchronously > + * > + * The caller is responsible for free'ing @msg if it was allocated > + * on the heap > + * > + * Returns 0 on success, -1 on failure > + */ > +int virNetClientSendWithReplyStream(virNetClientPtr client, > + virNetMessagePtr msg, > + virNetClientStreamPtr st) > +{ > + int ret = virNetClientSendInternal(client, msg, st, true, false); > + if (ret < 0) > + return -1; > + return 0; > } > + > diff --git a/src/rpc/virnetclient.h b/src/rpc/virnetclient.h > index 61d51e1..7c30d2b 100644 > --- a/src/rpc/virnetclient.h > +++ b/src/rpc/virnetclient.h > @@ -76,6 +76,9 @@ int virNetClientSendNoReply(virNetClientPtr client, > int virNetClientSendNonBlock(virNetClientPtr client, > virNetMessagePtr msg); > > +int virNetClientSendWithReplyStream(virNetClientPtr client, > + virNetMessagePtr msg, > + virNetClientStreamPtr st); > > # ifdef HAVE_SASL > void virNetClientSetSASLSession(virNetClientPtr client, > diff --git a/src/rpc/virnetclientstream.c b/src/rpc/virnetclientstream.c > index a4292e7..0b3e4f5 100644 > --- a/src/rpc/virnetclientstream.c > +++ b/src/rpc/virnetclientstream.c > @@ -408,7 +408,7 @@ int virNetClientStreamRecvPacket(virNetClientStreamPtr st, > > VIR_DEBUG("Dummy packet to wait for stream data"); > virMutexUnlock(&st->lock); > - ret = virNetClientSendWithReply(client, msg); > + ret = virNetClientSendWithReplyStream(client, msg, st); > virMutexLock(&st->lock); > virNetMessageFree(msg); > > @@ -530,3 +530,8 @@ cleanup: > virMutexUnlock(&st->lock); > return ret; > } > + > +bool virNetClientStreamEOF(virNetClientStreamPtr st) > +{ > + return st ? st->incomingEOF : false; > +} IMHO, the caller not pass in NULL in the first place... > diff --git a/src/rpc/virnetclientstream.h b/src/rpc/virnetclientstream.h > index 6c8d538..07a335a 100644 > --- a/src/rpc/virnetclientstream.h > +++ b/src/rpc/virnetclientstream.h > @@ -72,5 +72,6 @@ int virNetClientStreamEventUpdateCallback(virNetClientStreamPtr st, > int events); > int virNetClientStreamEventRemoveCallback(virNetClientStreamPtr st); > > +bool virNetClientStreamEOF(virNetClientStreamPtr st); meaing this should be ATTRIBUTE_NONNULL(1) Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| -- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list