Re: Is t5516 somehow flakey only on macOS?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Jan 09, 2021 at 02:42:56PM -0800, Junio C Hamano wrote:

> SZEDER Gábor <szeder.dev@xxxxxxxxx> writes:
> 
> >> Commit 014ade7484 (upload-pack: send ERR packet for non-tip objects,
> >> 2019-04-13) added a test that greps the output of a failed fetch to make
> >> sure that upload-pack sent us the ERR packet we expected. But checking
> >> this is racy; despite the argument in that commit, the client may still
> >> be sending a "done" line when the server exits, causing it to die() on a
> >
> > Nit: I think using the word "after" would make the problematic
> > sequence of events a tad clearer, i.e. "... after the server has
> > exited, ...".
> 
> Thanks everybody for helping with extra clarity.  I do find it
> easier to follow with "after the server has exited" to picture
> the sequence of events in my mind.

Agreed. Here's the same patch with the commit message updated to use
"after", plus Eric's typofix.

-- >8 --
Subject: [PATCH] t5516: loosen "not our ref" error check

Commit 014ade7484 (upload-pack: send ERR packet for non-tip objects,
2019-04-13) added a test that greps the output of a failed fetch to make
sure that upload-pack sent us the ERR packet we expected. But checking
this is racy; despite the argument in that commit, the client may still
be sending a "done" line after the server exits, causing it to die() on
a failed write() and never see the ERR packet at all.

This fails quite rarely on Linux, but more often on macOS. However, it
can be triggered reliably with:

	diff --git a/fetch-pack.c b/fetch-pack.c
	index 876f90c759..cf40de9092 100644
	--- a/fetch-pack.c
	+++ b/fetch-pack.c
	@@ -489,6 +489,7 @@ static int find_common(struct fetch_negotiator *negotiator,
	 done:
	 	trace2_region_leave("fetch-pack", "negotiation_v0_v1", the_repository);
	 	if (!got_ready || !no_done) {
	+		sleep(1);
	 		packet_buf_write(&req_buf, "done\n");
	 		send_request(args, fd[1], &req_buf);
	 	}

This is a real user-visible race that it would be nice to fix, but it's
tricky to do so: the client would have to speculatively try to read an
ERR packet after hitting a write() error. And at least for this error,
it's specific to v0 (since v2 does not enforce reachability at all).

So let's loosen the test to avoid annoying racy failures. If we
eventually do the read-after-failed-write thing, we can tighten it. And
if not, v0 will grow increasingly obsolete as servers support v2, so the
utility of this test will decrease over time anyway.

Note that we can still check stderr to make sure upload-pack bailed for
the reason we expected. It writes a similar message to stderr, and
because the server side is just another process connected by pipes,
we'll reliably see it. This would not be the case for git://, or for
ssh servers that do not relay stderr (e.g., GitHub's custom endpoint
does not).

Helped-by: SZEDER Gábor <szeder.dev@xxxxxxxxx>
Signed-off-by: Jeff King <peff@xxxxxxxx>
---
 t/t5516-fetch-push.sh | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 01004ff680..3ed121d0ce 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -1260,7 +1260,9 @@ do
 			git cat-file commit $SHA1_2 &&
 			test_must_fail env GIT_TEST_PROTOCOL_VERSION=0 \
 				git fetch ../testrepo/.git $SHA1_3 2>err &&
-			test_i18ngrep "remote error:.*not our ref.*$SHA1_3\$" err
+			# ideally we would insist this be on a "remote error:"
+			# line, but it is racy; see the commit message
+			test_i18ngrep "not our ref.*$SHA1_3\$" err
 		)
 	'
 done
-- 
2.30.0.455.g8a2dc96235




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux