Hi Junio, On Tue, 26 Jun 2018, Junio C Hamano wrote: > Christian Couder <christian.couder@xxxxxxxxx> writes: > > > Obviousness is often not the same for everybody. > > ... which you just learned---what you thought obvious turns out to > be not so obvious after all, so you adjust to help your readers. Indeed. And trying to tell the reader that they should find it obvious is not exactly productive. It only causes bad feelings and can be easily avoided. > >> In this particular case it even feels as if this test is not even > >> testing what it should test at all: > >> > >> - it should verify that all of the commits in the first parent > >> lineage are part of the list > > > > It does that. > > > >> - it should verify that none of the other commits are in the list > > > > It does that too. > > But the point is it does a lot more by insisting exact output. Indeed. One thing it does in addition is to make the test code a lot less obvious for readers in general. Let me summarize again what good regression tests have to deliver, because I think it cannot be stressed enough, especially in this context where we run the danger of adding poor regression tests: - a regression test needs to catch regressions (d'oh) - a regression test needs to be *quick*. Otherwise developers will skip running them, which is worse than having no tests at all (because the effort to develop them is wasted) - a regression test must make it as easy as possible to fix regressions There are quite a few corollaries to that last point, some of which are: - a regression test that fails for anything but an obvious reason is *useless* - a regression test that tries to test for *everything* (including dogs and cats) not only breaks the quickness requirement, but it also makes it confusing where to start fixing things. "The test talked about --bisect but now I am stuck somewhere in XYZ.c, what has *that* to do with --bisect?" is *not* something you want to risk any developer yelling at the test code that you authored - less is more. If you can use commits that were already generated in the common "setup" step, there is literally a negative value in generating a new set of commits instead - less is more. If you can catch the same regressions in three, concise *and* understandable lines, avoid using thirty lines instead - regression tests should not need to be adjusted when the logic changes in an intended way. It is a strong sign that a regression test was written badly if it starts failing for any reason other than a regression The overzealous "I want this output to be exactly this" stance of the tests we are discussing here is a very obvious violation here. Regression tests should fail only to indicate regressions. Really. Let me repeat that because it is so obvious when you think about it, but it is easy to forget when writing regression tests: Regression tests should fail only to indicate regressions. > For example, the version I reviewed had a two "expected output", and > said that the actual output must match either one of them. I guess it > was because there were two entries with the same distance and we cannot > rely on which order they appear in the result? If a test history gained > another entry with the same distance, then would we need 6 possible > expected output because we cannot rely on the order in which these three > come out? It is totally unnecessary to go there, as it would make those regression tests a lot less valuable than they could otherwise be. Let me elaborate further below. > That was the only thing I was complaining about. Dscho gave me too > much credit and read a lot more good things than what I actually > meant to say ;-). Why don't you just accept my praise gracefully? ;-) It's not that I gave you a lot of praise recently, even if you clearly deserve it. > >> And that is really all there is to test. > > Another is that "rev-list --bisect-all" promises that the entries > are ordered by the distance value. Yes! And you know what we can do there? We can test *precisely* that! # verify that the output is sorted by `dist` (descending) sed "s/.*dist=\([0-9]*\).*/\1/" <revs >dist && sort -n -r <dist >dist.sorted && test_cmp dist dist.sorted This extracts the distance numbers into their own file, then verifies that they were already sorted. The really big advantage here is that any future change that might result in a different order of entries with the same "dist" value *will not cause this regression test to fail*. And for a good reason: because ordering identical "dist" values differently is not a regression at all. > So taking the above three points, perhaps > > cat >expect <<EOF && > ... as written in one of the expect list in Tiago's patch > EOF Please no. Please let's *not* generate more commits when we already have a perfectly fine set of commits generated by the setup phase of the very same script. Please let's not use confusing names for those commits. Please let's not add a diagram whos layout deviates for no good reason from the layout used by the existing diagram in the same file. > # Make sure we have the same entries, nothing more, nothing less > git rev-list --bisect-all $other_args >actual && > sort actual >actual.sorted && > sort expect >expect.sorted && > test_cmp expect.sorted actual.sorted With plenty experience in investigating test failures of Git's test suite under my belt, let me tell you how much faster it is for me to start debugging when reading this code instead: git rev-list --bisect-all --first-parent F..E >revs && # only E, e1..e8 should be listed, nothing else test_line_count = 9 revs && for rev in E e1 e2 e3 e4 e5 e6 e7 e8 do grep "^$(git rev-parse $rev) " revs || return done I am faster by... a lot. Like, seconds instead of minutes. As a bonus, it is also shorter. Quicker to read. Easier to grok. Quicker to validate manually. > # Make sure the entries are sorted in the dist order > sed -e 's/.*(dist=\([1-9]*[0-9]\)).*/\1/' actual >actual.dists && > sort actual.dists >actual.dists.sorted && > test_cmp actual.dists.sorted actual.dists > > is what I would have expected. ... but without the [1-9] because the distance can be 0. And with a star after the [0-9] instead. And then you have essentially the very same thing I suggested above. A readable regression test, that only triggers failures upon the regression we want to prevent, is fast to grok, actionable, and leads to fast debugging. Just to make sure: I am utterly disinterested in pushing "my" code through. I am very much interested in having a good patch with a good regression test and a good commit message. Otherwise I would not put up with what it takes to review it. So Chris, if you feel that you must "win" against me, just take Junio's code already. In that case, however, do change his code to avoid generating those unnecessary commits. Thanks, Dscho