t5570-git-daemon fails with SIGPIPE on OSX

Travis CI changed its default OSX image to use XCode 9.4 on 2018-07-31
[1].  Since then OSX build jobs fail rather frequently because of a
SIGPIPE in the tests 'fetch notices corrupt pack' or 'fetch notices
corrupt idx' in 't5570-git-daemon.sh' [2].  I think this is a symptom
of a real bug in Git affecting other platforms as well, but these
tests are too lax to catch it.

What it boils down to is this sequence:

  - The test first prepares a repository containing a corrupt pack,
    ready to be served via 'git daemon'.

  - Then the test runs 'test_must_fail git fetch ....', which connects
    to 'git daemon', which forks 'git upload-pack', which then
    advertises refs (only HEAD) and capabilities.  So far so good.

  - 'git fetch' eventually calls fetch-pack.c:find_common().  The
    first half of this function assembles a request consisting of a
    want and a flush pkt-line, and sends it via a send_request() call.

    At this point the scheduling becomes important: let's suppose that
    fetch is slow and upload-pack is fast.

  - 'git upload-pack' receives the request, parses the want line,
    notices the corrupt pack, responds with an 'ERR upload-pack: not
    our ref' pkt-line, and die()s right away.

  - 'git fetch' finally approaches the end of the function, where it
    attempts to send a done pkt-line via another send_request() call
    through the now closing TCP socket.

  - What happens now seems to depend on the platform:

    - On Linux, both on my machine and on Travis CI, it shows textbook
      example behaviour: write() returns with error and sets errno to
      ECONNRESET.  Since it happens in write_or_die(), 'git fetch'
      die()s with 'fatal: write error: Connection reset by peer', and
      doesn't show the error sent by 'git upload-pack'; how could it,
      when it doesn't even get as far as receiving upload-pack's ERR
      pkt-line.

      The test only checks that 'git fetch' fails, but it doesn't
      check whether it failed with the right error message, so the
      test still succeeds.  Had it checked the error message as well,
      we would most likely have noticed this issue already, as it is
      not all that rare.

    - On the new OSX images with XCode 9.4 on Travis CI the write()
      triggers SIGPIPE right away, and 'test_must_fail' notices it and
      fails the test.  I couldn't see any sign of an ECONNRESET or any
      other error that we could act upon to avoid the SIGPIPE.

    - On OSX with XCode 9.2 on Travis CI there is neither SIGPIPE, nor
      ECONNRESET, but sending the request actually succeeds even
      though there is no process on the other end of the socket
      anymore.  'git fetch' then simply continues execution, reads and
      parses the ERR pkt-line, and then die()s with 'fatal: remote
      error: upload-pack: not our ref'.  So, on the face of it, it
      shows the desired behaviour, but I have no idea how that write()
      could succeed instead of returning an error.
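
To make the three outcomes above concrete, here is a minimal sketch of
a check that is stricter than "the command failed".  It is not taken
from Git's test suite: the helper names died_by_signal and
failed_with_remote_error are made up for illustration, the signal
range is a simplification that assumes a POSIX-style shell reporting
128 + signal number, and the sample stderr files just contain the
messages quoted above.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical, not part of Git.

# Succeeds if the exit status looks like death by signal.  POSIX-style
# shells report 128 + signo; SIGPIPE is signal 13, hence 141.
died_by_signal () {
	test "$1" -gt 128 && test "$1" -le 192
}

# Succeeds only if the command failed without being killed by a signal
# AND its stderr (saved in the file "$2") carries the remote's error.
failed_with_remote_error () {
	code=$1
	stderr_file=$2
	test "$code" -ne 0 &&
	! died_by_signal "$code" &&
	grep -q 'remote error: upload-pack: not our ref' "$stderr_file"
}

# Sample stderr contents, copied from the messages quoted above.
tmp=$(mktemp -d)
echo 'fatal: remote error: upload-pack: not our ref' >"$tmp/remote"
echo 'fatal: write error: Connection reset by peer' >"$tmp/reset"

# die() exits with status 128; a SIGPIPE death shows up as 141.
failed_with_remote_error 128 "$tmp/remote" && echo "ERR shown: accepted"
failed_with_remote_error 128 "$tmp/reset" || echo "ECONNRESET: rejected"
failed_with_remote_error 141 "$tmp/remote" || echo "SIGPIPE: rejected"

rm -rf "$tmp"
```

With a check along these lines, only the XCode 9.2 outcome would pass;
both the ECONNRESET case on Linux and the SIGPIPE case on XCode 9.4
would be flagged.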

I don't know what happens on a real Mac as I don't have access to one;
I figured out all the above by enabling packet tracing, adding a
couple of well-placed tracing printf() and sleep() calls, running a
bunch of builds on Travis CI, and looking through their logs.  But
without access to a debugger, netstat, and whatnot I can't really
go any further.  So I would now happily pass the baton to those who
have a Mac and know a thing or two about its porting issues to first
check whether OSX on a real Mac shows the same behaviour as it does in
Travis CI's virtualized(?) environment.  And then they can pass the
baton to those who know all the intricacies of the pack protocol and
its implementation to decide what to do with this issue.
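
For anyone picking this up and reading packet traces, it may help to
recall how the pkt-lines mentioned above are framed: four hex digits
giving the total length (including those four digits), followed by the
payload, with "0000" serving as the flush packet.  The sketch below
only illustrates that framing; pkt_line and flush_pkt are hypothetical
helper names, and the ERR payload is the shortened message as quoted
above.

```shell
#!/bin/sh
# Sketch only: pkt_line and flush_pkt are hypothetical helper names.

# Frame a payload as a pkt-line: four hex digits holding the total
# length (the four digits themselves plus the payload), then the
# payload.  ${#1} counts characters, so this assumes ASCII payloads.
pkt_line () {
	printf '%04x%s' "$((${#1} + 4))" "$1"
}

# A flush-pkt is the special length "0000" with no payload.
flush_pkt () {
	printf '0000'
}

pkt_line 'ERR upload-pack: not our ref'; echo
flush_pkt; echo
```

Real payloads usually end with a LF, which counts toward the length,
so an actual trace will show lengths one higher than this helper
prints for the same text without the LF.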

For a mostly reliable reproduction recipe you might want to fetch this
branch:

  https://github.com/szeder/git t5570-git-daemon-sigpipe

and then run 'make && cd t && ./t5570-git-daemon.sh -v -x'


Have fun! ;)


1 - https://blog.travis-ci.com/2018-07-19-xcode9-4-default-announce

2 - On git.git's master:
    https://travis-ci.org/git/git/jobs/411517552#L2717


