Re: [RFC PATCH v5] Implement --first-parent for git rev-list --bisect

Johannes Schindelin <Johannes.Schindelin@xxxxxx> · Fri, 29 Jun 2018 13:20:58 +0200 (DST)

Hi Junio,

On Thu, 28 Jun 2018, Junio C Hamano wrote:

> What I meant by "many separte grep calls" was to contrast these two
> approaches:
> 
>  * Have one typical output spelled out as "expect", take an output
>    from an actual run into "actual", make them comparable and then
>    do a compare (which does not use grep; it uses sort in the
>    "making comparable" phase)
> 
>  * Not have any typical output, take an output from an actual run,
>    and have _code_ that inspects the output encode the rule over the
>    output (e.g. "these must exist" is inspected with many grep
>    invocations)
> 
> Two things the "output must have 9 entries, and these 9 must be
> mentioned" we see at the beginning of this message does not verify
> are (1) exact dist value given to each of these entries and (2) the
> order in which these entries appear in the output.  The latter is
> something we document, and the test should cover.  The former does
> not have to be cast in stone (i.e. I do not think it does not make a
> difference to label the edge commits with dist=1 or dist=0 as long
> as everything is consistent), but if there is no strong reason to
> keep it possible for us to later change how the numbers are assigned,
> I am OK if the test cast it in stone.
> 
> Another reason (other than "many separate invocation of grep") to
> favor the former approach (i.e. use real-looking expected output,
> munge it and the real output into comparable forms and then compare)
> is that it is easier to see what aspect of the output we care (and
> we do not care) about.
> 
> It is harder to see the omission of exact dist value and ordering of
> entries in the outpu in the latter approach, and more importantly,
> know if the omission was deliberate (e.g. it was merely an example)
> or a mere mistake.
> 
> With "using a real-looking expected output, make it and the actual
> output comparable and then compare" approach, the aspects in the
> output we choose not to care about will show in the "make them
> comparable" munging.  If we do not care the exact dist values, there
> would be something like s/dist=[0-9]*/dist=X/ to mask the exact
> value before making the two comparable to see that the expect and
> the actual files have the same entries.  If we still do care about
> the entries are ordered by the dist values, there would be something
> that sorts the entries with the actual dist values before doing that
> masking to ensure if the order is correct.

The problem here is of course that you *cannot* set the exact output
in stone, because of sorting instabilities.

So you have to play tricks to sort (twice, with different keys) the
expected output and the actual output, to verify that all the expected
commits are listed (and only those) and to verify that they are ordered by
the distance, in descending order.

And this trick, while it still makes the test correct and stable and yadda
yadda, *also* makes this trick a lot less readable than my version. And
therefore makes it more difficult to verify that your test actually does
what it is supposed to do.

Ciao,
Dscho