Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes: > On Wed, Feb 16 2022, Glen Choo wrote: > >> In a previous commit, we replaced test_cmp invocations with >> verify_fetch_result(). Finish the process of removing test_cmp by using >> grep in verify_fetch_result() instead. >> >> This makes the tests less sensitive to changes because, instead of >> checking the whole stderr, we only grep for the lines of the form >> >> * "<old-head>..<new-head>\s+branch\s+-> origin/branch" >> * "Fetching submodule <submodule-path>" (if fetching a submodule) >> >> when we expect the repo to have fetched. If we expect the repo to not >> have fetched, grep to make sure the lines are absent. Also, simplify the >> assertions by using grep patterns that match only the relevant pieces of >> information, e.g. <old-head> is irrelevant because we only want to know >> if the fetch was performed, so we don't need to know where the branch >> was before the fetch. > > I tried ejecting 1/9 and 2/9 out of this series locally, and it passes > all tests until the new tests you add in 7/9. > > As ugly as some of the pre-image is, I wonder if dropping these first > two and biting the bullet and just continuing with the test_cmp is > better. > > The test_cmp is going to catch issues that even the cleverest grep > combinations won't, e.g. in the submodule-in-C series I discovered a bug > where all of our testing & the existing series hadn't spotted that we > were dropping a \n at the end in one of the messages. I think there are two schools of thought on how to test informational messages: - assert an exact match on the exact output that we generate - assert that the output contains the pieces of information we care about These two approaches are virtually opposites on two axes - the former will catch unintentional changes (like you've noted) and the latter saves on maintenance effort and tends to be less noisy in tests. Personally, I'm a bit torn between both approaches in general because I want tests to be maintainable (testing the exact output is a bit of an antipattern at Google), but I'm not very comfortable with the fact that unintended changes can sneak through. So I don't think there's a correct answer in general. Maybe an acceptable rule of thumb is that test_cmp is good until it starts getting in the way of reading and writing understandable tests. If we agree on that rule, then for this patch, I think replacing test_cmp is the way to go, primarily because it lets us ignore the 'old head' of the branch before the fetch, e.g. in the quoted example.. >> test_expect_success setup ' >> @@ -274,13 +277,10 @@ test_expect_success "Recursion doesn't happen when no new commits are fetched in >> ' >> >> test_expect_success "Recursion stops when no new submodule commits are fetched" ' >> - head1=$(git rev-parse --short HEAD) && >> git add submodule && >> git commit -m "new submodule" && >> - head2=$(git rev-parse --short HEAD) && >> - echo "From $pwd/." > expect.err.super && >> - echo " $head1..$head2 super -> origin/super" >>expect.err.super && > > ...as opposed to if we just rolled the generation of this into some > utility printf function. we'd still have to deal with $head1 if we use test_cmp. That's fine for this test, because it's pretty simple, but it gets pretty janky later on: @@ -345,20 +339,13 @@ test_expect_success "Recursion picks up all submodules when necessary" ' git fetch && git checkout -q FETCH_HEAD ) && - head1=$(git rev-parse --short HEAD^) && git add subdir/deepsubmodule && git commit -m "new deepsubmodule" && - head2=$(git rev-parse --short HEAD) && - echo "Fetching submodule submodule" > ../expect.err.sub && - echo "From $pwd/submodule" >> ../expect.err.sub && - echo " $head1..$head2 sub -> origin/sub" >> ../expect.err.sub + git rev-parse --short HEAD >../subhead ) && - head1=$(git rev-parse --short HEAD) && git add submodule && git commit -m "new submodule" && - head2=$(git rev-parse --short HEAD) && - echo "From $pwd/." > expect.err.super && - echo " $head1..$head2 super -> origin/super" >> expect.err.super && + git rev-parse --short HEAD >superhead && ( cd downstream && git fetch >../actual.out 2>../actual.err In this example, we have two $head1 variables in different subshells, one of which is HEAD, but the other is HEAD^. The reason why we want HEAD^ isn't obvious (IIRC it's because the submodule upstream is 2 commits ahead because we add the deepsubmodule in a separate commit), in my opinion, and I got tripped up quite a few times trying to read and understand the test. That's a lot of effort to spend on irrelevant information - the test actually cares about what it fetched, not where the ref used to be. So for that reason, I'd prefer to remove test_cmp for this test.