Re: [PATCH v2 2/9] t5526: use grep to assert on fetches

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Thu, 17 Feb 2022 10:25:52 +0100

On Thu, Feb 17 2022, Glen Choo wrote:

> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:
>
>> On Wed, Feb 16 2022, Glen Choo wrote:
>>
>>> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:
>>>
>>>> On Wed, Feb 16 2022, Glen Choo wrote:
>>>>
>>>>> In a previous commit, we replaced test_cmp invocations with
>>>>> verify_fetch_result(). Finish the process of removing test_cmp by using
>>>>> grep in verify_fetch_result() instead.
>>>>>
>>>>> This makes the tests less sensitive to changes because, instead of
>>>>> checking the whole stderr, we only grep for the lines of the form
>>>>>
>>>>> * "<old-head>..<new-head>\s+branch\s+-> origin/branch"
>>>>> * "Fetching submodule <submodule-path>" (if fetching a submodule)
>>>>>
>>>>> when we expect the repo to have fetched. If we expect the repo to not
>>>>> have fetched, grep to make sure the lines are absent. Also, simplify the
>>>>> assertions by using grep patterns that match only the relevant pieces of
>>>>> information, e.g. <old-head> is irrelevant because we only want to know
>>>>> if the fetch was performed, so we don't need to know where the branch
>>>>> was before the fetch.
>>>>
>>>> I tried ejecting 1/9 and 2/9 out of this series locally, and it passes
>>>> all tests until the new tests you add in 7/9.
>>>>
>>>> As ugly as some of the pre-image is, I wonder if dropping these first
>>>> two and biting the bullet and just continuing with the test_cmp is
>>>> better.
>>>>
>>>> The test_cmp is going to catch issues that even the cleverest grep
>>>> combinations won't, e.g. in the submodule-in-C series I discovered a bug
>>>> where all of our testing & the existing series hadn't spotted that we
>>>> were dropping a \n at the end in one of the messages.
>>>
>>> I think there are two schools of thought on how to test informational
>>> messages:
>>>
>>> - assert an exact match on the exact output that we generate
>>> - assert that the output contains the pieces of information we care
>>>   about
>>>
>>> These two approaches are virtually opposites on two axes - the former
>>> will catch unintentional changes (like you've noted)[...]
>>
>> Yes, and to be fair I'm thoroughly in the "assert an exact match" camp,
>> i.e. "let's just use test_cmp", and not everyone would agree with that.
>>
>> I mean, I don't think we should test_cmp every single instance of a
>> command, but for things that are *the tests* concerning themselves with
>> what the output should be, yes we should do that.
>
> That's a good point I hadn't considered, which is that if we want any
> hope of catching unintentional changes in our test suite, we'd want
> _some_ test to check the output. For "git fetch --recurse-submodules",
> it makes the most sense for that test to live in this file.
>
> By eliminating all instances of test_cmp in this file in particular, we
> lose assurances that we don't introduce accidental changes. It makes
> sense to at least have some tests explicitly for output.
>
>>
>>> [...] and the latter saves on maintenance effort and tends to be less noisy in tests.
>>
>> I also don't think you're right about the other approach "sav[ing] on
>> [future] maintenance effort" in this case.
>>
>> If I was needing to adjust some of this output I'd spend way longer on
>> trying to carefully reason that some series of "grep" invocations were
>> really doing the right thing, and probably end up doing the equivalent
>> of a "test_cmp" for myself out of general paranoia, whereas adjusting
>> the output.
>
> That's fair. I've optimized the tests for readability by putting
> complicated logic in the test helper. But any diligent test reader would
> need to read the test helper to convince themselves of its correctness.
> In this case, I agree that the helper is too complex.
>
>>> Personally, I'm a bit torn between both approaches in general because I
>>> want tests to be maintainable (testing the exact output is a bit of an
>>> antipattern at Google), but I'm not very comfortable with the fact that
>>> unintended changes can sneak through.
>>
>> Yes, anyway whatever one thinks in general what I meant to point out
>> here with "biting the bullet" is that whatever one thinks in general
>> about the right approch for new tests, this series in particular seems
>> to be creating more work for itself than it needs by refactoring the
>> test_cmp in existing tests just to add a few new ones.
>>
>> I.e. even if you'd like to not use test_cmp-alike for the new tests,
>> wouldn't it be simpler to just leave the old ones in place and use your
>> new helper for your new tests?
>
> I'm not sure about this - avoiding changing old tests leads to
> fragmentation in the test suite and even the same file. I find it very
> challenging to read/modify files like this, because there is no longer a
> consistent style for the file, and I have to figure out which is the
> "good" way to write tests.
>
> This suggestion makes sense if there's some qualitative difference
> between the new tests and old ones besides just 'being new'. This isn't
> true for this series, so I'd prefer to keep things consistent.
>
>>> So I don't think there's a correct answer in general. Maybe an
>>> acceptable rule of thumb is that test_cmp is good until it starts
>>> getting in the way of reading and writing understandable tests.
>>>
>>> If we agree on that rule, then for this patch, I think replacing
>>> test_cmp is the way to go, primarily because it lets us ignore the 'old
>>> head' of the branch before the fetch, e.g. in the quoted example..
>>
>> [...]
>>
>>>>>  test_expect_success setup '
>>>>> @@ -274,13 +277,10 @@ test_expect_success "Recursion doesn't happen when no new commits are fetched in
>>>>>  '
>>>>>  
>>>>>  test_expect_success "Recursion stops when no new submodule commits are fetched" '
>>>>> -	head1=$(git rev-parse --short HEAD) &&
>>>>>  	git add submodule &&
>>>>>  	git commit -m "new submodule" &&
>>>>> -	head2=$(git rev-parse --short HEAD) &&
>>>>> -	echo "From $pwd/." > expect.err.super &&
>>>>> -	echo "   $head1..$head2  super      -> origin/super" >>expect.err.super &&
>>>>
>>>> ...as opposed to if we just rolled the generation of this into some
>>>> utility printf function.
>>>
>>> we'd still have to deal with $head1 if we use test_cmp. That's fine for
>>> this test, because it's pretty simple, but it gets pretty janky later
>>> on:
>>>
>>>   @@ -345,20 +339,13 @@ test_expect_success "Recursion picks up all submodules when necessary" '
>>>         git fetch &&
>>>         git checkout -q FETCH_HEAD
>>>       ) &&
>>>   -		head1=$(git rev-parse --short HEAD^) &&
>>>       git add subdir/deepsubmodule &&
>>>       git commit -m "new deepsubmodule" &&
>>>   -		head2=$(git rev-parse --short HEAD) &&
>>>   -		echo "Fetching submodule submodule" > ../expect.err.sub &&
>>>   -		echo "From $pwd/submodule" >> ../expect.err.sub &&
>>>   -		echo "   $head1..$head2  sub        -> origin/sub" >> ../expect.err.sub
>>>   +		git rev-parse --short HEAD >../subhead
>>>     ) &&
>>>   -	head1=$(git rev-parse --short HEAD) &&
>>>     git add submodule &&
>>>     git commit -m "new submodule" &&
>>>   -	head2=$(git rev-parse --short HEAD) &&
>>>   -	echo "From $pwd/." > expect.err.super &&
>>>   -	echo "   $head1..$head2  super      -> origin/super" >> expect.err.super &&
>>>   +	git rev-parse --short HEAD >superhead &&
>>>     (
>>>       cd downstream &&
>>>       git fetch >../actual.out 2>../actual.err
>>>
>>> In this example, we have two $head1 variables in different subshells,
>>> one of which is HEAD, but the other is HEAD^. The reason why we want
>>> HEAD^ isn't obvious (IIRC it's because the submodule upstream is 2
>>> commits ahead because we add the deepsubmodule in a separate commit), in
>>> my opinion, and I got tripped up quite a few times trying to read and
>>> understand the test. That's a lot of effort to spend on irrelevant
>>> information - the test actually cares about what it fetched, not where
>>> the ref used to be.
>>>
>>> So for that reason, I'd prefer to remove test_cmp for this test.
>>
>> I agree that it's pretty irrelevant, but I also think we'd be throwing
>> the baby out with the bath water by entirely doing away with test_cmp
>> here, there's an easier way to do this.
> [...]
>> Instead let's just test once somewhere that when we run submodule
>> fetching that submodules are indeed updated appropriately. Surely other
>> submodule tests will break if the "update" code is made to NOOP, or
>> update to the wrong HEAD>
>>
>> Then for all these test_cmp tests we can just sed-away the
>> $head1..$head2 with something like (untested):
>>
>>     sed -n -e 's/[^.]*\.\.[^.]*/OLD..NEW/g'
>>
>> I.e. let's just skip this entire ceremony with asserting the old/new
>> HEAD unless it's really needed (and then we can probably do it once
>> outside a test_cmp).
>>
>> If you grep through the test suite for "sed" adjacent to "test_cmp"
>> you'll find a lot of such examples of munging the output before
>> test_cmp-ing it.
>
> Makes sense, I hadn't considered this (though I have seen the pattern in
> the test suite, oops). The most compelling argument in favor of this is
> that this could remove a lot of the complexity of verify_fetch_result(),
> which is impeding test readability.
>
>> I.e. none of these tests surely need to test that we updated from
>> $head1..$head2 again and again with the corresponding verbosity in test
>> setup and shelling out to "git rev-parse --short HEAD" or whatever.
>
> I find the converse (we are testing the formatting over and over again)
> less convincing. In these tests, we really are checking for $head2 in
> the stderr to verify that the correct thing was fetched. I'm not
> convinced that we should be relying on _other_ submodule tests to tell
> us that submodule fetch is broken. Which brings me back to the original
> motivation of this patch..
>
>>
>> That's perfectly fine here, since the actual point of the test_cmp is to
>> check the formatting/order etc. of the output itself, not to continually
>> re-assert that submodule updating still works, and that we get the right
>> OIDs.
>
> which is that these tests actually are continually re-asserting the
> submodule updating works correctly in the different circumstances, and
> since we use the stderr to check this, test_cmp adds unwarranted noise.
>
> But you are correct in that the point of test_cmp is to check
> formatting/order etc. There is value in using test_cmp for this purpose,
> and getting rid of it entirely creates a hole in our test coverage.
> (This wouldn't mean that we'd need to use test_cmp _everywhere_ though,
> only that we need to use test_cmp _somewhere_.)
>
> As it stands:
>
> +   test_cmp can improve the readability of the test helpers and
>     debuggability of tests vs grep
> +/- test_cmp can catch formatting changes that are hard to catch
>     otherwise, though at the cost of being sensitive to _any_ formatting
>     changes
> -   test_cmp needs some munging to eliminate unnecessary information
>
> so on the whole, I think it's worth trying to use test_cmp in the test
> helper. We may not _need_ it everywhere, but if it would be nice to use
> it in as many places as possible.

I think whatever you opt to go for here makes sense. I just wanted to
provide the feedback in case it was helpful, i.e. when reading it I
found these conversions a bit odd, wondered if they were strictly needed
etc.

Thanks!