[no subject]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



    credential-cache--daemon: clarify "exit" action semantics

    When this code was originally written, there wasn't much
    thought given to the timing between a client asking for
    "exit", the daemon signaling that the action is done (with
    EOF), and the actual cleanup of the socket.

    However, we need to care about this so that our test scripts
    do not end up racy (e.g., by asking for an exit and checking
    that the socket was cleaned up). The code that is already
    there happens to behave very reasonably; let's add a comment
    to make it clear that any changes should retain the same
    behavior.

So with the proposed change, t0301 will now fail racily. We need that
unlink() to happen before the fclose(). Just calling exit() does things
in the right order, but it should also be OK to do an explicit:

  delete_tempfile(&socket_file);
  fclose(in);
  fclose(out);

That would probably require making socket_file a global variable. (You
can't just return out of the serving loop, since that closes the sockets
first before deleting the tempfile).

> Clients can signal the git-credential-cache(1) daemon that it is
> supposed to exit by sending it an "exit" command. The details around
> how exactly the daemon exits seem to be rather intricate as spelt out by
> a comment surrounding our call to exit(3p), as we need to be mindful
> around closing the client streams before we signal the client.
> 
> The logic is broken on Cygwin though: when a client asks the daemon to
> exit, they won't see the EOF and will instead get an error message:
> 
>   fatal: read error from cache daemon: Software caused connection abort
> 
> This issue is known in Cygwin, see for example [1], but the exact root
> cause is not known.
> [...]
> [1]: https://github.com/cygporter/git/issues/51

I don't see any details at that issue, but I'm not sure how it would fix
things. From the client's perspective, they are going to see the
descriptor either way. Unless there is some magic that fclose() does
that a normal descriptor-close-on-exit does not do.

That "Software caused connection abort" thing seems like a weird
not-standard-Unix errno value. Searching for it mostly yields people
complaining about getting it from ssh under cygwin. :)

If the magic that cygwin needs is actually "fclose before unlink", then
that is in opposition to other platforms (and I suspect would make t0301
racy there).

> As it turns out, we can avoid the issue by explicitly closing the client
> streams via fclose(3p). I'm not sure at all where the claimed atexit(3p)
> handler mentioned in the comment is supposed to live, but from all I can
> see we do not have any installed that would close the sockets for us. So
> this leaves me with a bit of a sour taste overall.

The mention of atexit is a little oblique these days. We moved from a
manual atexit handler to using the tempfile.c handler in 9e9033166b
(credential-cache--daemon: use tempfile module, 2015-08-10).

> That being said, I couldn't spot anything obviously wrong with closing
> the streams ourselves, and it does fix the issue on Cygwin without any
> regressions on other platforms as far as I can see. So let's go for this
> fix, even though I cannot properly explain it.

Running t0301 with --stress on Linux failed for me after a minute or so.

-Peff




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux