Re: [PATCH 3/6] t6601: add helper for testing path-walk API

Jonathan Tan <jonathantanmy@xxxxxxxxxx> · Mon, 4 Nov 2024 15:39:12 -0800

Derrick Stolee <stolee@xxxxxxxxx> writes:
> You are correct that if the path-walk API emitted multiple batches
> with the same path name, then we would not detect that via the current
> testing strategy.
> 
> The main reason to use the sort is to avoid adding a restriction on
> the order in which objects appear within the batch.
> 
> Your recommendation to group a batch into a single line does not
> strike me as a suitable approach, because long lines become hard to
> read and difficult to parse diffs. (Also, the order within the batch
> becomes baked in as a requirement.)

The hashes in a line can be abbreviated if line length is a concern.
Also, note that I am suggesting sorting the OIDs within a line (that is,
a batch), and also sorting the lines (batches) as a whole.
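
As an illustration only, the normalization could look something like the
sketch below. It assumes a hypothetical helper output with one object per
line in the form <batch>:<TYPE>:<path>:<oid>, where <batch> is a number the
helper assigns to each batch and paths contain no ':'. The function name
and the 7-character abbreviation are likewise made up for the example.

```shell
# Sketch: print one line per batch, with abbreviated OIDs sorted within
# the line and the lines themselves sorted.
# Assumed (hypothetical) input format: <batch>:<TYPE>:<path>:<oid>
normalize_batches () {
	# Group by batch number, then order the OIDs inside each batch.
	LC_ALL=C sort -t: -k1,1n -k4,4 |
	awk -F: '
		NR == 1 || $1 != batch {
			# A new batch starts: flush the previous line.
			if (NR > 1) print line
			batch = $1
			line = $2 ":" $3 ":"
			sep = ""
		}
		{
			# Append the abbreviated OID to the current batch line.
			line = line sep substr($4, 1, 7)
			sep = " "
		}
		END { if (NR > 0) print line }
	' |
	# Sort the batches so that their order is not baked into the test.
	LC_ALL=C sort
}

# Example:
#   printf '%s\n' '1:TREE:left/:bbbbbbbbbb' \
#                 '0:COMMIT::cccccccccc' \
#                 '0:COMMIT::aaaaaaaaaa' | normalize_batches
# prints:
#   COMMIT::aaaaaaa ccccccc
#   TREE:left/:bbbbbbb
```

Because the batch number is dropped from the output, two batches that share
a path show up as two lines with the same TYPE:path prefix, so a repeated
path stays visible in the diff against the expected file.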

> The biggest question I'd like to ask is this: do you see a risk of
> a path being repeated? There are cases where it will happen, such as
> indexed objects that are not reachable anywhere else.

I was thinking that the whole point of this feature is that we group
objects by path, so it seems desirable to test that paths are not
repeated. (Or are repeated as little as possible, if repetition cannot
be avoided, e.g. in the case you describe.)
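
A direct check for repeated paths might be sketched as follows. This
assumes a hypothetical helper output with one object per line in the form
<batch>:<TYPE>:<path>:<oid>, with a distinct <batch> number per batch and
no ':' inside paths. The function name is made up for the example.

```shell
# Sketch: list every TYPE:path pair that appears in more than one batch.
# Assumed (hypothetical) input format: <batch>:<TYPE>:<path>:<oid>
repeated_paths () {
	# Collapse each batch to a single <batch>:<TYPE>:<path> entry ...
	cut -d: -f1-3 | LC_ALL=C sort -u |
	# ... then drop the batch number and report keys seen more than once.
	cut -d: -f2,3 | LC_ALL=C sort | uniq -d
}
```

A test could then assert that the output is empty, or compare it against
a short list of paths where repetition is known to be unavoidable.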

> The way I would consider modifying these tests to reflect the batching
> would be to associate each batch with a number, causing the order of
> the paths to become hard-coded in the test. Something like
> 
>    0:COMMIT::$(git rev-parse ...)
>    0:COMMIT::$(git rev-parse ...)
>    1:TREE::$(git rev-parse ...)
>    1:TREE::$(git rev-parse ...)
>    2:TREE:right/:$(git rev-parse ...)
>    3:BLOB:right/a:$(...)
>    4:TREE:left/:$(git rev-parse ...)
>    5:BLOB:left/b:$(...)
> 
> This would imply some amount of order that maybe should become a
> requirement of the API.
> 
> Thanks,
> -Stolee

If we're willing to declare an order in which we will return paths to
the user, that would work too. (I'm not sure that we need to declare an
order, though.)