Re: [RFC PATCH] t/Makefile: use dependency graph for "check-chainlint"

Eric Sunshine <sunshine@xxxxxxxxxxxxxx> · Tue, 14 Dec 2021 02:44:25 -0500

On Mon, Dec 13, 2021 at 5:09 AM Ævar Arnfjörð Bjarmason
<avarab@xxxxxxxxx> wrote:
> On Mon, Dec 13 2021, Eric Sunshine wrote:
> > Rather than running `chainlint` and `diff` once per self-test -- which
> > may become expensive as more tests are added -- instead run `chainlint`
> > a single time over all tests bodies collectively and compare the result
> > to the collective "expected" output.
>
> I think that "optimizing" things like this is an anti-pattern. I.e. we
> have N chainlint test files, and N potential outputs from that (ok or
> not, and with/without error). If one of the chainlint tests changes
> we'd like to re-run it, if not we can re-use an earlier run.

As mentioned in a reply elsewhere, the commit message sells this as an
optimization, but that's not the real reason for the change, which is
that the rewritten `check-chainlint` target for the upcoming new
chainlint really wants to have a composite file of the "test" input
and a composite of the "expect" output. I didn't know how to sell that
change in this preparatory patch series, so I weakly framed it as an
optimization. The reason for making this a preparatory step is that it
makes for a less noisy patch later on when the new chainlint is
actually plugged into the `check-chainlint` target. At least, it was
less noisy originally... in the final implementation, I think it ends
up being noisy anyhow. So, maybe it makes sense to drop this patch
altogether(?).

> This is something make's dependency logic is perfectly suited for, and
> will be faster than any optimization of turning a for-loop into a
> "sed" command we run every time, since we'll only need to "stat" a few
> things to see that there's nothing to do.
>
> +BUILT_CHAINLINTTESTS = $(patsubst %,.build/%.actual,$(CHAINLINTTESTS))
> +
> +.build/chainlint:
> +       mkdir -p $@
> +
> +$(BUILT_CHAINLINTTESTS): | .build/chainlint
> +$(BUILT_CHAINLINTTESTS): .build/%.actual: %
> +       $(CHAINLINT) <$< | \
> +       sed -e '/^# LINT: /d' >$@ && \
> +       diff -u $(basename $<).expect $@
> +
> +check-chainlint: $(BUILT_CHAINLINTTESTS)

This sort of optimization makes sense (I think) even with the new
chainlint preferring to see composite "test" and "expect" files rather
than the individual files. The individual files would be prerequisites
of the composite files, thus the composites would only be regenerated
if the individual files change. And the composite files would be
prerequisites of the `check-chainlint` target, so it would only run if
the composite files change (or if chainlint itself changes).
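
Roughly, and untested, the composite variant might look like the
following. It assumes CHAINLINTTESTS holds the chainlint/*.test paths
(as the snippet above seems to), that chainlint reads its input on
stdin, and that the linter lives in chainlint.sed; the file names
under .build/chainlint/ are made up for illustration, and recipe
lines need a leading TAB:

    # Assumption: one .expect file sits next to each .test file.
    CHAINLINTTESTS = $(sort $(wildcard chainlint/*.test))
    CHAINLINTEXPECTS = $(CHAINLINTTESTS:.test=.expect)

    .build/chainlint:
    	mkdir -p $@

    # Composites are regenerated only when an individual file changes.
    # The directory is order-only so its mtime does not force rebuilds.
    .build/chainlint/tests: $(CHAINLINTTESTS) | .build/chainlint
    	cat $(CHAINLINTTESTS) >$@

    .build/chainlint/expect: $(CHAINLINTEXPECTS) | .build/chainlint
    	cat $(CHAINLINTEXPECTS) >$@

    # Hypothetical stamp file: the check reruns only if a composite
    # or chainlint itself is newer than the last successful run.
    .build/chainlint/ok: .build/chainlint/tests .build/chainlint/expect chainlint.sed
    	$(CHAINLINT) <.build/chainlint/tests | \
    	sed -e '/^# LINT: /d' >.build/chainlint/actual && \
    	diff -u .build/chainlint/expect .build/chainlint/actual && \
    	touch $@

    check-chainlint: .build/chainlint/ok

The stamp file is what makes a no-op run cheap: `make` only has to
stat the individual files, the two composites, and the stamp.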

In fact, with the new chainlint checking all tests in all scripts at
once, this technique should apply nicely to it, as well, since the
names of test scripts (t????-*.sh) are fed to it as command-line
arguments. Thus, the t????-*.sh files could be prerequisites of the
chainlint rule which would use $? to check only test scripts which
have changed since the previous run.
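
As an equally rough and untested sketch of the $? idea: this assumes
T lists the t????-*.sh scripts and that CHAINLINT expands to a command
accepting script names as arguments; the stamp file name is invented
for illustration, and the recipe line needs a leading TAB:

    T = $(sort $(wildcard t[0-9][0-9][0-9][0-9]-*.sh))

    # $? expands to the prerequisites newer than the stamp (or to all
    # of them if the stamp does not exist yet), so only changed
    # scripts are passed to chainlint.
    .build/chainlint-scripts.ok: $(T)
    	mkdir -p $(@D) && $(CHAINLINT) $? && touch $@

If chainlint fails, the stamp is not touched, so the same scripts are
checked again on the next run. One wrinkle is that a change to
chainlint itself ought to recheck every script, which $? alone does
not give us.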

Having said all that, I don't think the changes in this series
or the upcoming new chainlint series make the situation any worse
(Makefile-wise) than its current state, and the sort of optimizations
discussed here can easily be made after those series land. (And, as my
Git time is rather limited these days, I'd really like to focus on
getting the core components landed.)