On Fri, Dec 05, 2014 at 12:30:17PM +0100, Jiri Denemark wrote:
On Mon, Dec 01, 2014 at 12:00:52 +0100, Martin Kletzander wrote:Whenever client socket was disconnected due to keepalive timeout, the I/O event loop did not exit and continued until the point where the hangup was found. Ending with an error right away when the keepalive times out takes care of this problem. Signed-off-by: Martin Kletzander <mkletzan@xxxxxxxxxx> --- src/rpc/virkeepalive.c | 11 ++++++----- src/rpc/virnetclient.c | 1 + 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/src/rpc/virkeepalive.c b/src/rpc/virkeepalive.c index c882313..f382f93 100644 --- a/src/rpc/virkeepalive.c +++ b/src/rpc/virkeepalive.c @@ -136,11 +136,12 @@ virKeepAliveTimerInternal(virKeepAlivePtr ka, ka, ka->client, ka->countToDeath, timeval); if (ka->countToDeath == 0) { - VIR_WARN("No response from client %p after %d keepalive messages in" - " %d seconds", - ka->client, - ka->count, - timeval); + virReportError(VIR_ERR_INTERNAL_ERROR, + _("No response from client %p after %d keepalive messages in" + " %d seconds"), + ka->client, + ka->count, + timeval); return true; } else { ka->countToDeath--; diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c index 8657b0e..ac4850c 100644 --- a/src/rpc/virnetclient.c +++ b/src/rpc/virnetclient.c @@ -1525,6 +1525,7 @@ static int virNetClientIOEventLoop(virNetClientPtr client, if (virKeepAliveTrigger(client->keepalive, &msg)) { virNetClientMarkClose(client, VIR_CONNECT_CLOSE_REASON_KEEPALIVE); + goto error; } else if (msg && virNetClientQueueNonBlocking(client, msg) < 0) { VIR_WARN("Could not queue keepalive request"); virNetMessageFree(msg);I don't think this patch is quite right. The first hunk is OK, but the second one is problematic. We don't have an event loop here (sigh) and thus we use the passing the buck algorithm to process requests from several threads. So in general, in case of any error (not just keepalive timeout), we should make sure we process all possibly pending data on the socket first and let completed requests finish. Only unfinished requests should report the error. We should first finish the rest of the loop and only jump to the error path once we processed possibly completed requests. Moreover we should make sure the right error is reported by all threads, i.e., we either need to carry the original error in client and copy it from there to the thread local error object or each thread needs to report the error again (which is the current approach with "received hangup / error event on socket" message at the end of the loop). Welcome to the client's IO loop code :-)
So if I understand it properly, I cannot use virReportError(), I cannot finish the execution with any data left in the socket to read... What *can* I do? :D As you ACKed the second patch, which can go in without this one, would you be OK with it going it with the previous series as well (all together), i.e.: https://www.redhat.com/archives/libvir-list/2014-November/msg01081.html ttps://www.redhat.com/archives/libvir-list/2014-November/msg01082.html *and* 2/2 of this v2? Martin
Attachment:
signature.asc
Description: Digital signature
-- libvir-list mailing list libvir-list@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/libvir-list