Re: [PATCH v2 2/9] t5526: use grep to assert on fetches

Glen Choo <chooglen@xxxxxxxxxx> · Fri, 18 Feb 2022 00:16:44 +0800

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:

> On Thu, Feb 17 2022, Glen Choo wrote:
>
>> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:
>>
>>> On Wed, Feb 16 2022, Glen Choo wrote:
>>>
>>>> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:
>>>>
>>>>> On Wed, Feb 16 2022, Glen Choo wrote:
>>>>>
>>>>>> In a previous commit, we replaced test_cmp invocations with
>>>>>> verify_fetch_result(). Finish the process of removing test_cmp by using
>>>>>> grep in verify_fetch_result() instead.
>>>>>>
>>>>>> This makes the tests less sensitive to changes because, instead of
>>>>>> checking the whole stderr, we only grep for the lines of the form
>>>>>>
>>>>>> * "<old-head>..<new-head>\s+branch\s+-> origin/branch"
>>>>>> * "Fetching submodule <submodule-path>" (if fetching a submodule)
>>>>>>
>>>>>> when we expect the repo to have fetched. If we expect the repo to not
>>>>>> have fetched, grep to make sure the lines are absent. Also, simplify the
>>>>>> assertions by using grep patterns that match only the relevant pieces of
>>>>>> information, e.g. <old-head> is irrelevant because we only want to know
>>>>>> if the fetch was performed, so we don't need to know where the branch
>>>>>> was before the fetch.
>>>>>
>>>>> I tried ejecting 1/9 and 2/9 out of this series locally, and it passes
>>>>> all tests until the new tests you add in 7/9.
>>>>>
>>>>> As ugly as some of the pre-image is, I wonder if dropping these first
>>>>> two and biting the bullet and just continuing with the test_cmp is
>>>>> better.
>>>>>
>>>>> The test_cmp is going to catch issues that even the cleverest grep
>>>>> combinations won't, e.g. in the submodule-in-C series I discovered a bug
>>>>> where all of our testing & the existing series hadn't spotted that we
>>>>> were dropping a \n at the end in one of the messages.
>>>>
>>>> I think there are two schools of thought on how to test informational
>>>> messages:
>>>>
>>>> - assert an exact match on the exact output that we generate
>>>> - assert that the output contains the pieces of information we care
>>>>   about
>>>>
>>>> These two approaches are virtually opposites on two axes - the former
>>>> will catch unintentional changes (like you've noted)[...]
>>>
>>> Yes, and to be fair I'm thoroughly in the "assert an exact match" camp,
>>> i.e. "let's just use test_cmp", and not everyone would agree with that.
>>>
>>> I mean, I don't think we should test_cmp every single instance of a
>>> command, but for things that are *the tests* concerning themselves with
>>> what the output should be, yes we should do that.
>>
>> That's a good point I hadn't considered, which is that if we want any
>> hope of catching unintentional changes in our test suite, we'd want
>> _some_ test to check the output. For "git fetch --recurse-submodules",
>> it makes the most sense for that test to live in this file.
>>
>> By eliminating all instances of test_cmp in this file in particular, we
>> lose assurances that we don't introduce accidental changes. It makes
>> sense to at least have some tests explicitly for output.
>>
>>>
>>>> [...] and the latter saves on maintenance effort and tends to be less noisy in tests.
>>>
>>> I also don't think you're right about the other approach "sav[ing] on
>>> [future] maintenance effort" in this case.
>>>
>>> If I was needing to adjust some of this output I'd spend way longer on
>>> trying to carefully reason that some series of "grep" invocations were
>>> really doing the right thing, and probably end up doing the equivalent
>>> of a "test_cmp" for myself out of general paranoia, whereas adjusting
>>> the output.
>>
>> That's fair. I've optimized the tests for readability by putting
>> complicated logic in the test helper. But any diligent test reader would
>> need to read the test helper to convince themselves of its correctness.
>> In this case, I agree that the helper is too complex.
>>
>>>> Personally, I'm a bit torn between both approaches in general because I
>>>> want tests to be maintainable (testing the exact output is a bit of an
>>>> antipattern at Google), but I'm not very comfortable with the fact that
>>>> unintended changes can sneak through.
>>>
>>> Yes, anyway whatever one thinks in general what I meant to point out
>>> here with "biting the bullet" is that whatever one thinks in general
>>> about the right approch for new tests, this series in particular seems
>>> to be creating more work for itself than it needs by refactoring the
>>> test_cmp in existing tests just to add a few new ones.
>>>
>>> I.e. even if you'd like to not use test_cmp-alike for the new tests,
>>> wouldn't it be simpler to just leave the old ones in place and use your
>>> new helper for your new tests?
>>
>> I'm not sure about this - avoiding changing old tests leads to
>> fragmentation in the test suite and even the same file. I find it very
>> challenging to read/modify files like this, because there is no longer a
>> consistent style for the file, and I have to figure out which is the
>> "good" way to write tests.
>>
>> This suggestion makes sense if there's some qualitative difference
>> between the new tests and old ones besides just 'being new'. This isn't
>> true for this series, so I'd prefer to keep things consistent.
>>
>>>> So I don't think there's a correct answer in general. Maybe an
>>>> acceptable rule of thumb is that test_cmp is good until it starts
>>>> getting in the way of reading and writing understandable tests.
>>>>
>>>> If we agree on that rule, then for this patch, I think replacing
>>>> test_cmp is the way to go, primarily because it lets us ignore the 'old
>>>> head' of the branch before the fetch, e.g. in the quoted example..
>>>
>>> [...]
>>>
>>>>>>  test_expect_success setup '
>>>>>> @@ -274,13 +277,10 @@ test_expect_success "Recursion doesn't happen when no new commits are fetched in
>>>>>>  '
>>>>>>  
>>>>>>  test_expect_success "Recursion stops when no new submodule commits are fetched" '
>>>>>> -	head1=$(git rev-parse --short HEAD) &&
>>>>>>  	git add submodule &&
>>>>>>  	git commit -m "new submodule" &&
>>>>>> -	head2=$(git rev-parse --short HEAD) &&
>>>>>> -	echo "From $pwd/." > expect.err.super &&
>>>>>> -	echo "   $head1..$head2  super      -> origin/super" >>expect.err.super &&
>>>>>
>>>>> ...as opposed to if we just rolled the generation of this into some
>>>>> utility printf function.
>>>>
>>>> we'd still have to deal with $head1 if we use test_cmp. That's fine for
>>>> this test, because it's pretty simple, but it gets pretty janky later
>>>> on:
>>>>
>>>>   @@ -345,20 +339,13 @@ test_expect_success "Recursion picks up all submodules when necessary" '
>>>>         git fetch &&
>>>>         git checkout -q FETCH_HEAD
>>>>       ) &&
>>>>   -		head1=$(git rev-parse --short HEAD^) &&
>>>>       git add subdir/deepsubmodule &&
>>>>       git commit -m "new deepsubmodule" &&
>>>>   -		head2=$(git rev-parse --short HEAD) &&
>>>>   -		echo "Fetching submodule submodule" > ../expect.err.sub &&
>>>>   -		echo "From $pwd/submodule" >> ../expect.err.sub &&
>>>>   -		echo "   $head1..$head2  sub        -> origin/sub" >> ../expect.err.sub
>>>>   +		git rev-parse --short HEAD >../subhead
>>>>     ) &&
>>>>   -	head1=$(git rev-parse --short HEAD) &&
>>>>     git add submodule &&
>>>>     git commit -m "new submodule" &&
>>>>   -	head2=$(git rev-parse --short HEAD) &&
>>>>   -	echo "From $pwd/." > expect.err.super &&
>>>>   -	echo "   $head1..$head2  super      -> origin/super" >> expect.err.super &&
>>>>   +	git rev-parse --short HEAD >superhead &&
>>>>     (
>>>>       cd downstream &&
>>>>       git fetch >../actual.out 2>../actual.err
>>>>
>>>> In this example, we have two $head1 variables in different subshells,
>>>> one of which is HEAD, but the other is HEAD^. The reason why we want
>>>> HEAD^ isn't obvious (IIRC it's because the submodule upstream is 2
>>>> commits ahead because we add the deepsubmodule in a separate commit), in
>>>> my opinion, and I got tripped up quite a few times trying to read and
>>>> understand the test. That's a lot of effort to spend on irrelevant
>>>> information - the test actually cares about what it fetched, not where
>>>> the ref used to be.
>>>>
>>>> So for that reason, I'd prefer to remove test_cmp for this test.
>>>
>>> I agree that it's pretty irrelevant, but I also think we'd be throwing
>>> the baby out with the bath water by entirely doing away with test_cmp
>>> here, there's an easier way to do this.
>> [...]
>>> Instead let's just test once somewhere that when we run submodule
>>> fetching that submodules are indeed updated appropriately. Surely other
>>> submodule tests will break if the "update" code is made to NOOP, or
>>> update to the wrong HEAD>
>>>
>>> Then for all these test_cmp tests we can just sed-away the
>>> $head1..$head2 with something like (untested):
>>>
>>>     sed -n -e 's/[^.]*\.\.[^.]*/OLD..NEW/g'
>>>
>>> I.e. let's just skip this entire ceremony with asserting the old/new
>>> HEAD unless it's really needed (and then we can probably do it once
>>> outside a test_cmp).
>>>
>>> If you grep through the test suite for "sed" adjacent to "test_cmp"
>>> you'll find a lot of such examples of munging the output before
>>> test_cmp-ing it.
>>
>> Makes sense, I hadn't considered this (though I have seen the pattern in
>> the test suite, oops). The most compelling argument in favor of this is
>> that this could remove a lot of the complexity of verify_fetch_result(),
>> which is impeding test readability.
>>
>>> I.e. none of these tests surely need to test that we updated from
>>> $head1..$head2 again and again with the corresponding verbosity in test
>>> setup and shelling out to "git rev-parse --short HEAD" or whatever.
>>
>> I find the converse (we are testing the formatting over and over again)
>> less convincing. In these tests, we really are checking for $head2 in
>> the stderr to verify that the correct thing was fetched. I'm not
>> convinced that we should be relying on _other_ submodule tests to tell
>> us that submodule fetch is broken. Which brings me back to the original
>> motivation of this patch..
>>
>>>
>>> That's perfectly fine here, since the actual point of the test_cmp is to
>>> check the formatting/order etc. of the output itself, not to continually
>>> re-assert that submodule updating still works, and that we get the right
>>> OIDs.
>>
>> which is that these tests actually are continually re-asserting the
>> submodule updating works correctly in the different circumstances, and
>> since we use the stderr to check this, test_cmp adds unwarranted noise.
>>
>> But you are correct in that the point of test_cmp is to check
>> formatting/order etc. There is value in using test_cmp for this purpose,
>> and getting rid of it entirely creates a hole in our test coverage.
>> (This wouldn't mean that we'd need to use test_cmp _everywhere_ though,
>> only that we need to use test_cmp _somewhere_.)
>>
>> As it stands:
>>
>> +   test_cmp can improve the readability of the test helpers and
>>     debuggability of tests vs grep
>> +/- test_cmp can catch formatting changes that are hard to catch
>>     otherwise, though at the cost of being sensitive to _any_ formatting
>>     changes
>> -   test_cmp needs some munging to eliminate unnecessary information
>>
>> so on the whole, I think it's worth trying to use test_cmp in the test
>> helper. We may not _need_ it everywhere, but if it would be nice to use
>> it in as many places as possible.
>
> I think whatever you opt to go for here makes sense. I just wanted to
> provide the feedback in case it was helpful, i.e. when reading it I
> found these conversions a bit odd, wondered if they were strictly needed
> etc.
>
> Thanks!

The additional perspective was indeed helpful, thanks!