Re: [PATCH] credential-cache: interpret an ECONNRESET as on EOF

Ramsay Jones <ramsay@xxxxxxxxxxxxxxxxxxxx> · Thu, 27 Jul 2017 15:42:36 +0100

On 27/07/17 15:17, Jeff King wrote:
> On Thu, Jul 27, 2017 at 02:08:40AM +0100, Ramsay Jones wrote:
> 
>> In order to suppress the fatal exit in this case, check the read error
>> for an ECONNRESET and return as if no data was read from the daemon.
>> This effectively converts an ECONNRESET into an EOF.
> 
> Yeah, I think this is a perfectly reasonable thing to do.
> 
> I'm not sure if we'd _ever_ see ECONNRESET on Linux. The only time I've
> seen it on Linux (and I coincidentally fixed a bug just like this a week
> or two ago, so it's fresh in my mind): if read() would return EOF but
> there's outstanding data from a previous write, you get ECONNRESET.
> Which is obviously going to be totally racy, since "outstanding" across
> a TCP connection involves more than just the local kernel.
> 
> That shouldn't happen in this 'exit' case. But even if it does, there's
> really nothing the client should do differently anyway. We're waiting
> for the other side to exit, and they did. Hooray.
> 
> Though your change:
> 
>> diff --git a/credential-cache.c b/credential-cache.c
>> index 91550bfb0..1cccc3a0b 100644
>> --- a/credential-cache.c
>> +++ b/credential-cache.c
>> @@ -25,7 +25,7 @@ static int send_request(const char *socket, const struct strbuf *out)
>>  		int r;
>>  
>>  		r = read_in_full(fd, in, sizeof(in));
>> -		if (r == 0)
>> +		if (r == 0 || (r < 0 && errno == ECONNRESET))
>>  			break;
>>  		if (r < 0)
>>  			die_errno("read error from cache daemon");
> 
> ...is in the generic send_request(). Which means that it hits all of the
> commands. So if we were to send a credential and the server
> crashed midway through reading it, we'd get ECONNRESET. And instead of
> reporting an error, we'd quietly assume the server had no response. But
> again, I don't think that's really different than the EOF behavior. The
> server could read our whole request and then crash, and we'd mistakenly
> think it had nothing to say.

yep, I did think about adding a FLAG_EXIT (or somesuch) and passing
it down through the call stack to send_request() so that I could do
the check only for the 'exit' command. I decided against it in the
end, obviously! ;-)

> So I don't think this really changes the robustness much. If we did want
> to address those cases, we'd require actual end-of-record framing in the
> protocol. But I don't think it's worth caring too much about (this is,
> after all, a local and internal socket between two relatively simple
> programs).

Adding the framing to the protocol was actually my first (very fleeting)
idea. However, I didn't want to get into the 'supporting old/new client
and server combos' problem.

> Which is all a verbose way of saying that your patch looks good to me.
> Thanks for digging up the root cause of the test failures.

Thanks!

ATB,
Ramsay Jones