Re: t0301-credential-cache test failure on cygwin

Ramsay Jones <ramsay@xxxxxxxxxxxxxxxxxxxx> · Thu, 7 Jul 2022 20:59:17 +0100

On 07/07/2022 19:12, Jeff King wrote:
> On Thu, Jul 07, 2022 at 02:50:21AM +0100, Ramsay Jones wrote:
> 
>> Having deleted the above patch, I now had a look at the server side. Tracing
>> out the server execution showed no surprises - everything progressed as one
>> would expect and it 'exit(0)'-ed correctly! The relevant part of the code to
>> process a client request (in the serve_one_client() function, lines 132-142
>> in builtin/credential-cache--daemon.c) looks like:
>>
>> 	else if (!strcmp(action.buf, "exit")) {
>> 		/*
>> 		 * It's important that we clean up our socket first, and then
>> 		 * signal the client only once we have finished the cleanup.
>> 		 * Calling exit() directly does this, because we clean up in
>> 		 * our atexit() handler, and then signal the client when our
>> 		 * process actually ends, which closes the socket and gives
>> 		 * them EOF.
>> 		 */
>> 		exit(0);
>> 	}
>>
>> Now, the comment doesn't make clear to me why "it's important that we clean
>> up our socket first" and, indeed, whether 'socket' refers to the socket
>> descriptor or the socket file. In the past, all of my unix-stream-socket
>> servers have closed the socket descriptor and then unlink()-ed the socket
>> file before exit(), with no 'atexit' calls in sight (lightly cribbed from a
>> 30+ years old Unix Network programming book by Stevens - or was it the Comer
>> book - or maybe the Comer and Stevens book - I forget!).
> 
> That comment refers to the socket file. If we close the handle to the
> client before we clean up the socket file, then the client may finish
> while the socket file is still there. So anybody expecting that:
> 
>   git credential-cache exit
> 
> is a sequencing operation will be fooled. One obvious thing is:
> 
>   git credential-cache exit
>   git credential-cache store <some-cred
> 
> which is now racy; the second command may try to contact the socket for
> the exiting daemon. It might actually handle that gracefully (because
> the server wouldn't actually accept()) but I didn't check. But another
> example, and the one that motivated that comment is:
> 
>   git credential-cache exit
>   test_path_is_missing $HOME/.git-credential-cache/socket
> 
> which is exactly what our tests do. ;) See the discussion around here:
> 
>   https://lore.kernel.org/git/20160318061201.GA28102@xxxxxxxxxxxxxxxxxxxxx/

I have now read (much) of that thread and it makes sense now. (Yes, I probably
should have postponed sending the email until after researching some more today;
lesson learned).

[snip]
>> So, we now have three patches which 'fix' the issue. What does this tell us?
>> Well, not an awful lot! ;-)
> 
> Of the three, I actually like the client-side one to check errno the
> best. The client is mostly "best effort". If it can't talk to the daemon
> for whatever reason, then it becomes a noop (there is nothing it can
> retrieve from the cache, and if it's trying to write, then oh well, the
> cached value was immediately expired!).
> 
> So one could argue that _every_ read error should be silently ignored.
> Calling die_errno() is mostly a nicety for debugging a broken setup, but
> in normal use, the outcome is the same either way (and Git will
> certainly ignore the exit code credential-cache anyway). I prefer the
> "ignore known harmless errors" approach, possibly because I am often the
> one debugging. ;) If ECONNABORTED is a harmless error we see in
> practice, I don't mind adding it to the list (under the same rationale
> as the current ECONNRESET that is there).

Yes, I was going to ask about ECONNRESET ... heh, no I'm kidding! :)

Yeah, if we can't determine the reason for cygwin changing behaviour
here (and fix it in cygwin), then this is probably the simplest solution.

ATB,
Ramsay Jones