[Resending with Clemens' last used email address.
Clemens, please consider sending a patch to update our .mailmap file.]

On Mon, Aug 6, 2018 at 5:11 PM SZEDER Gábor <szeder.dev@xxxxxxxxx> wrote:
>
> Travis CI changed its default OSX image to use XCode 9.4 on 2018-07-31
> [1]. Since then OSX build jobs fail rather frequently because of a
> SIGPIPE in the tests 'fetch notices corrupt pack' or 'fetch notices
> corrupt idx' in 't5570-git-daemon.sh' [2]. I think this is a symptom
> of a real bug in Git affecting other platforms as well, but these
> tests are too lax to catch it.
>
> What it boils down to is this sequence:
>
>   - The test first prepares a repository containing a corrupt pack,
>     ready to be served via 'git daemon'.
>
>   - Then the test runs 'test_must_fail git fetch ....', which connects
>     to 'git daemon', which forks 'git upload-pack', which then
>     advertises refs (only HEAD) and capabilities. So far so good.
>
>   - 'git fetch' eventually calls fetch-pack.c:find_common(). The
>     first half of this function assembles a request consisting of a
>     want and a flush pkt-line, and sends it via a send_request() call.
>
>     At this point the scheduling becomes important: let's suppose that
>     fetch is slow and upload-pack is fast.
>
>   - 'git upload-pack' receives the request, parses the want line,
>     notices the corrupt pack, responds with an 'ERR upload-pack: not
>     our ref' pkt-line, and die()s right away.
>
>   - 'git fetch' finally approaches the end of the function, where it
>     attempts to send a done pkt-line via another send_request() call
>     through the now closing TCP socket.
>
>   - What happens now seems to depend on the platform:
>
>     - On Linux, both on my machine and on Travis CI, it shows textbook
>       example behaviour: write() returns with error and sets errno to
>       ECONNRESET.
>       Since it happens in write_or_die(), 'git fetch' die()s with
>       'fatal: write error: Connection reset by peer', and doesn't
>       show the error sent by 'git upload-pack'; how could it, when it
>       doesn't even get as far as receiving upload-pack's ERR
>       pkt-line.
>
>       The test only checks that 'git fetch' fails, but it doesn't
>       check whether it failed with the right error message, so the
>       test still succeeds. Had it checked the error message as well,
>       we would most likely have noticed this issue already, as it
>       doesn't happen all that rarely.
>
>     - On the new OSX images with XCode 9.4 on Travis CI the write()
>       triggers SIGPIPE right away, and 'test_must_fail' notices it and
>       fails the test. I couldn't see any sign of an ECONNRESET or any
>       other error that we could act upon to avoid the SIGPIPE.
>
>     - On OSX with XCode 9.2 on Travis CI there is neither SIGPIPE nor
>       ECONNRESET; sending the request actually succeeds even though
>       there is no process on the other end of the socket anymore.
>       'git fetch' then simply continues execution, reads and parses
>       the ERR pkt-line, and then die()s with 'fatal: remote error:
>       upload-pack: not our ref'. So, on the face of it, it shows the
>       desired behaviour, but I have no idea how that write() could
>       succeed instead of returning an error.
>
> I don't know what happens on a real Mac as I don't have access to one;
> I figured out all the above by enabling packet tracing, adding a
> couple of well-placed tracing printf() and sleep() calls, running a
> bunch of builds on Travis CI, and looking through their logs. But
> without access to a debugger and netstat and whatnot I can't really
> go any further. So I would now happily pass the baton to those who
> have a Mac and know a thing or two about its porting issues to first
> check whether OSX on a real Mac shows the same behaviour as it does in
> Travis CI's virtualized(?) environment.
> And then they can pass the baton to those who know all the
> intricacies of the pack protocol and its implementation to decide
> what to do with this issue.
>
> For a mostly reliable reproduction recipe you might want to fetch this
> branch:
>
>   https://github.com/szeder/git t5570-git-daemon-sigpipe
>
> and then run 'make && cd t && ./t5570-git-daemon.sh -v -x'
>
>
> Have fun! ;)
>
>
> 1 - https://blog.travis-ci.com/2018-07-19-xcode9-4-default-announce
>
> 2 - On git.git's master:
>     https://travis-ci.org/git/git/jobs/411517552#L2717
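
For anyone who wants to poke at the ordering problem without a Git build,
here is a rough Python sketch of the sequence described above. It is not
Git's code. The helper names (pkt_line, read_pkt_line, simulate_race) are
made up for illustration, it uses a local Unix socket pair rather than a
TCP connection to 'git daemon', and it skips the initial want/flush
request. What it tries to show is that the ERR pkt-line is already queued
on the client side by the time the write of the done pkt-line fails, so a
client that stops at the write error never gets to report it.

```python
import socket


def pkt_line(payload: bytes) -> bytes:
    """Frame payload as a pkt-line: four hex digits giving the total
    length (payload plus the 4-byte header), then the payload."""
    return b"%04x" % (len(payload) + 4) + payload


def _read_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes or raise EOFError if the peer is gone."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise EOFError("peer closed the connection mid-packet")
        buf += chunk
    return buf


def read_pkt_line(sock: socket.socket) -> bytes:
    """Read one pkt-line and return its payload (b"" for a flush pkt)."""
    length = int(_read_exact(sock, 4), 16)
    if length == 0:
        return b""
    return _read_exact(sock, length - 4)


def simulate_race():
    """Hypothetical stand-in for the fast upload-pack / slow fetch case."""
    client, server = socket.socketpair()

    # "Fast upload-pack": report the error, then go away immediately.
    server.sendall(pkt_line(b"ERR upload-pack: not our ref\n"))
    server.close()

    # "Slow fetch": only now tries to send the done pkt-line. Python
    # ignores SIGPIPE by default, so a failed write surfaces as an
    # OSError here instead of killing the process. Which error (if any)
    # is raised depends on the platform and socket type.
    write_error = None
    try:
        client.sendall(pkt_line(b"done\n"))
    except OSError as exc:
        write_error = exc

    # The ERR pkt-line was queued before the peer closed, so it can
    # still be read even when the write above failed.
    reply = read_pkt_line(client)
    client.close()
    return write_error, reply


if __name__ == "__main__":
    err, reply = simulate_race()
    print("write error:", err)
    print("remote said:", reply.decode().rstrip())
```

Whether reading after a failed write is the right approach for fetch-pack
is a separate question. The sketch only shows that the remote's message
is there to be read.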