"Daniel P. Berrange" <berrange@xxxxxxxxxx> wrote: > I noticed a odd error message randomly appearing from virsh after all > commands had been run > > # virsh dominfo VirtTest > Id: - > Name: VirtTest > UUID: 82038f2a-1344-aaf7-1a85-2a7250be2076 > OS Type: hvm > State: shut off > CPU(s): 3 > Max memory: 256000 kB > Used memory: 256000 kB > Autostart: disable > > libvir: Remote error : server closed connection > > It turns out that inthe for(;;) loop where I'm procesing incoming data, > it was forgetting to break out of the loop when a completed RPC call > had been received. Most of the time this wasn't problem, because there'd > rarely be more data following, so it'd get EAGAIN next iteration & quit > the loop. When shutting down though, you'd occassionally see the SIGHUP > from read() before you get an EAGAIN, causing this error to be raised > even though the RPC call had been successfully received. > > In addition, if there was a 2nd RPC reply already pending, we'd read and > loose some of its data. Though I never saw this happen, it is a definite > theoretical possibility, particularly over a TCP link with bad latency > or fragementation or bursty traffic. > > Daniel > > diff -r e582072116a3 src/remote_internal.c > --- a/src/remote_internal.c Mon Jan 26 16:21:42 2009 +0000 > +++ b/src/remote_internal.c Mon Jan 26 17:02:15 2009 +0000 > @@ -6135,12 +6135,27 @@ processCallRecv(virConnectPtr conn, stru > if (priv->bufferOffset == priv->bufferLength) { > if (priv->bufferOffset == 4) { > ret = processCallRecvLen(conn, priv, in_open); > + if (ret < 0) > + return -1; This part is no net change. Just hoisting the test+return that used to follow both calls. > + /* > + * We'll carry on around the loop to immediately > + * process the message body, because it has probably > + * already arrived. Worst case, we'll get EAGAIN on > + * next iteration. > + */ > } else { > ret = processCallRecvMsg(conn, priv, in_open); > priv->bufferOffset = priv->bufferLength = 0; > - } > - if (ret < 0) > - return -1; > + /* > + * We've completed one call, so return even > + * though there might still be more data on > + * the wire. We need to actually let the caller > + * deal with this arrived message to keep good > + * response, and also to correctly handle EOF. > + */ > + return ret; This seems like a good idea for all the reasons you give. Suggestions, while you're in the area: - s/EGAIN/EAGAIN/ in a comment just above, in the same function. - less important: I'd move the declaration of "ret" down into the while loop, since it's used only therein. Save 2 lines (yeah, yeah, out of 6800+ -- gotta start somewhere) ACK -- Libvir-list mailing list Libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list