On 27/07/17 15:17, Jeff King wrote: > On Thu, Jul 27, 2017 at 02:08:40AM +0100, Ramsay Jones wrote: > >> In order to suppress the fatal exit in this case, check the read error >> for an ECONNRESET and return as if no data was read from the daemon. >> This effectively converts an ECONNRESET into an EOF. > > Yeah, I think this is a perfectly reasonable thing to do. > > I'm not sure if we'd _ever_ see ECONNRESET on Linux. The only time I've > seen it on Linux (and I coincidentally fixed a bug just like this a week > or two ago, so it's fresh in my mind): if read() would return EOF but > there's outstanding data from a previous write, you get ECONNRESET. > Which is obviously going to be totally racy, since "outstanding" across > a TCP connection involves more than just the local kernel. > > That shouldn't happen in this 'exit' case. But even if it does, there's > really nothing the client should do differently anyway. We're waiting > for the other side to exit, and they did. Hooray. > > Though your change: > >> diff --git a/credential-cache.c b/credential-cache.c >> index 91550bfb0..1cccc3a0b 100644 >> --- a/credential-cache.c >> +++ b/credential-cache.c >> @@ -25,7 +25,7 @@ static int send_request(const char *socket, const struct strbuf *out) >> int r; >> >> r = read_in_full(fd, in, sizeof(in)); >> - if (r == 0) >> + if (r == 0 || (r < 0 && errno == ECONNRESET)) >> break; >> if (r < 0) >> die_errno("read error from cache daemon"); > > ...is in the generic send_request(). Which means that it hits all of the > commands. So if we were to send a credential and the server > crashed midway through reading it, we'd get ECONNRESET. And instead of > reporting an error, we'd quietly assume the server had no response. But > again, I don't think that's really different than the EOF behavior. The > server could read our whole request and then crash, and we'd mistakenly > think it had nothing to say. yep, I did think about adding a FLAG_EXIT (or somesuch) and passing it down through the call stack to send_request() so that I could do the check only for the 'exit' command. I decided against it in the end, obviously! ;-) > So I don't think this really changes the robustness much. If we did want > to address those cases, we'd require actual end-of-record framing in the > protocol. But I don't think it's worth caring too much about (this is, > after all, a local and internal socket between two relatively simple > programs). Adding the framing to the protocol was actually my first (very fleeting) idea. However, I didn't want to get into the 'supporting old/new client and server combos' problem. > Which is all a verbose way of saying that your patch looks good to me. > Thanks for digging up the root cause of the test failures. Thanks! ATB, Ramsay Jones