Re: [PATCH] t3200: fix antipatterns in existing branch tests

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Mon, 21 Mar 2022 14:47:19 +0100

On Mon, Mar 21 2022, Tao Klerks via GitGitGadget wrote:

> +fetch_if_remote_ref_missing () {
> +	# this is an anti-pattern: swallows segfault
> +	#git show-ref -q "refs/remotes/$2/$1" || git fetch "$2"
> +	# this is slightly slower, up to 1s out of 6s on this set of tests:
> +	git fetch "$2"
> +	# this doesn't work
> +	#test_might_fail git show-ref -q "refs/remotes/$2/$1" || git fetch "$2"
> +}

Moving the context around a bit, as this refers to this code above:

>     I'm submitting this as RFC because I have a couple of significant
>     doubts:
>     
>      1. Does it make sense to do this? I believe it's a good idea to keep
>         things "clean" so that newcomers more easily do the right thing than
>         the wrong thing, but on the other hand, I've definitely read that we
>         have a "don't change things unnecessarily" bias somewhere.
>      2. What's the right pattern for the "(git show-ref -q
>         refs/remotes/local/main || git fetch local)" fetch-avoidance
>         optimization? Removing it adds a second to test runtimes, but Ævar
>         warned it hides segfaults

So first, 6s? Is this on Windows? I tried running this v.s. master:

    $ git hyperfine -L rev origin/master,HEAD~,HEAD -s 'make' '(cd t && ./t3200-branch.sh)'
    Benchmark 1: (cd t && ./t3200-branch.sh)' in 'origin/master
      Time (mean ± σ):      1.887 s ±  0.095 s    [User: 1.534 s, System: 0.514 s]
      Range (min … max):    1.826 s …  2.117 s    10 runs

    Benchmark 2: (cd t && ./t3200-branch.sh)' in 'HEAD~
      Time (mean ± σ):      2.132 s ±  0.013 s    [User: 1.742 s, System: 0.561 s]
      Range (min … max):    2.120 s …  2.166 s    10 runs

    Benchmark 3: (cd t && ./t3200-branch.sh)' in 'HEAD
      Time (mean ± σ):      1.944 s ±  0.005 s    [User: 1.620 s, System: 0.495 s]
      Range (min … max):    1.938 s …  1.953 s    10 runs

    Summary
      '(cd t && ./t3200-branch.sh)' in 'origin/master' ran
        1.03 ± 0.05 times faster than '(cd t && ./t3200-branch.sh)' in 'HEAD'
        1.13 ± 0.06 times faster than '(cd t && ./t3200-branch.sh)' in 'HEAD~'

The HEAD~ there is your patch here, and HEAD is my fix-up. I.e.:

	diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
	index 9bd621ed97e..7f7b3b28581 100755
	--- a/t/t3200-branch.sh
	+++ b/t/t3200-branch.sh
	@@ -17,12 +17,14 @@ test_set_remote () {
	 }

	 fetch_if_remote_ref_missing () {
	-	# this is an anti-pattern: swallows segfault
	-	#git show-ref -q "refs/remotes/$2/$1" || git fetch "$2"
	-	# this is slightly slower, up to 1s out of 6s on this set of tests:
	-	git fetch "$2"
	-	# this doesn't work
	-	#test_might_fail git show-ref -q "refs/remotes/$2/$1" || git fetch "$2"
	+	test_when_finished "rm -f ref" &&
	+	test_might_fail git rev-parse -q --verify "refs/remotes/$2/$1" >ref
	+	if ! test -s ref
	+	then
	+		# Purely an optimization, makes the test run ~10%
	+		# faster.
	+		git fetch "$2"
	+	fi
	 }

	 test_expect_success 'prepare a trivial repository' '

That's a safe way to do it that won't hide segfaults.

In *general* it's a bit painful to convert some of these, because we
really should refactor out the whole bit after "exit_code=$?" in
test_must_fail in test-lib-functions.sh into a utility
function. I.e. have the ability to run an arbitrary command, and then
after-the-fact ask if its exit code was OK.

If you'd like to refactor that that would be most welcome, and it
*would* help to convert some of these...

But in this case we can just use "rev-parse -q --verify", or rather,
nothing :)

I.e. my bias would be to just not try to optimize this, i.e. just
convert the users to the equivalent of a:

    git fetch "$2"

I.e. it's also useful to see that we behave correctly in the noop case,
and as there's no behavior difference it's a marginally more useful test
as a result.

And if you are trying to optimize this on Windows as I suspect I think
it's better to not do it. ~5s is the time it takes it just to get out of
bed in the morning as far as our test runtimes are concerned.

The real underlying issue is presumably its the shelling we'll do in
"git fetch", which we can eventually fix, and then make it approximately
the cost of the rev-parse when run locally...