On Sat, Oct 30 2021, Ævar Arnfjörð Bjarmason wrote:

> On Tue, Oct 26 2021, Eric Wong wrote:
>
>> Johannes Schindelin <Johannes.Schindelin@xxxxxx> wrote:
>>> * Test suite is slow. Shell scripts and process forking.
>>>
>>> * What if we had a special shell that interpreted the commands in
>>>   a single process?
>>>
>>> * Even Git commands like rev-parse and hash-object, as long as
>>>   that's not the command you're trying to test
>>
>> This is something I've wanted for a very long time as a scripter.
>> fast-import has been great over the years, as is
>> "cat-file --batch(-check)", but there are gaps that should be
>> filled (preferably without fragile linkage of shared libraries
>> into a script process).
>>
>>> * Dscho wants to slip in a C-based solution
>>>
>>> * Jonathan Tan commented: going back to your custom shell for
>>>   tests idea, one thing we could do is have a custom command that
>>>   generates the repo commits that we want (and that saves process
>>>   spawns and might make the tests simpler too)
>>
>> Perhaps a not-seriously-proposed patch from 2006 could be
>> modernized for our now-libified internals:
>
> I think something well short of a "C-based solution" could give us
> most of the wins here. Johannes was probably thinking of the
> scripting-being-slow-on-Windows aspect of it.
>
> But the main benefit of hypothetical C-based testing is that you
> can connect it to the dependency tree we have in the Makefile, and
> only re-run tests for code you needed to re-compile.
>
> So e.g. we don't need to run tests that invoke "git tag" if the
> dependency graph of builtin/tag.c didn't change.
>
> With COMPUTE_HEADER_DEPENDENCIES we've got access to that
> dependency information for our C code.
>
> With trace2 we could record an initial test run, and know which
> built-in commands are executed by which tests (even down to the
> sub-test level).
>
> Connecting these two means that we can find all tests that run
> "git fsck", and if builtin/fsck.c is the only thing that changed in
> an interactive rebase, those are the only tests we need to run.
>
> Of course changes to things like cache.h or t/test-lib.sh would
> spoil that cache entirely, but pretty much the same is true for
> re-compiling things now; so would changing say builtin/init-db.c,
> as almost every test does a "git init" somewhere.
>
> But I think that approach is viable, and should take us from a huge
> hypothetical project like "rewrite all the tests in C" to something
> that's a viable weekend hacking project for someone who's
> interested.
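To make the trace2 half of that concrete, here's a rough sketch. The
"cmd_name" event is documented in
Documentation/technical/api-trace2.txt, but the paths and glue below
are hypothetical, and it assumes test-lib.sh lets GIT_TRACE2_EVENT
through to the git processes a test spawns:

    # One-off recording run: map each test script to the set of
    # built-in commands it ends up running.
    mkdir -p /tmp/t2map &&
    cd t &&
    for t in t[0-9]*.sh
    do
        # every git process in the test appends its events here;
        # the path must be absolute, tests cd into trash directories
        GIT_TRACE2_EVENT=/tmp/t2map/$t.json ./$t >/dev/null 2>&1
        grep '"event":"cmd_name"' /tmp/t2map/$t.json |
        sed 's/.*"name":"\([^"]*\)".*/\1/' |
        sort -u >/tmp/t2map/$t.cmds
    done
    # Later, if builtin/fsck.c is the only thing that changed, list
    # only the tests that actually ran "git fsck":
    grep -l '^fsck$' /tmp/t2map/*.cmds

The other half, mapping changed files to built-in commands, could
lean on the dependency files COMPUTE_HEADER_DEPENDENCIES already
generates, or to start with just on the builtin/<cmd>.c file name.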
First, to outline some goals: I think saying we'd like to speed up
scripts is really getting into the weeds. Surely what we'd like to
speed up is test runs; generally speaking our test suite can be
parallelized, and it mostly doesn't matter whether it runs on your
computer or other people's computers, as long as it runs your code.
So:

 1. Even contributors who have a slow system would benefit from the
    hosted CI (on GitHub or wherever else) being faster.

 2. Our CI takes around 30-60m to finish.

 3. That CI time is almost entirely something that could be sped up
    by throwing hardware at it.

 4. We're currently using "Dv2 and DSv2-series" hosted runners
    (https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners),
    and we have quite a few people on-list who work for the
    company/companies involved. Is it within the realm of possibility
    to get more CI resources assigned to the git/git organization?

 5. Or, is there willingness from someone to host or pay for hosted
    runners? Not wearing any PLC hat, I'd think we could speed this
    up a lot with some reasonable spending, and if pushing to CI made
    a run finish in 3-5m instead of 60m that would be worthwhile.

 6. Related to #5: I've been able to set up hosted-runner jobs and
    self-hosted-runner jobs, but is there a way to do some
    opportunistic mixture of the two? Even one where self-hosted
    runners could come and go, and contribute resources to git/git's
    network whenever they're present?

 7. We run the various GIT_TEST_* etc. jobs in sequence; is there a
    reason we're serializing things in GitHub CI that could be
    parallelized? The vs-build and vs-test jobs run in parallel; is
    there any reason we're not doing that trick on the ubuntu runners
    other than "nobody got to it"? We currently seem to be trying
    hard to do the exact opposite. At the extreme end we could build
    git ~once and have N test jobs depend on that build, where
    N ~= $(ls t/*.sh | wc -l) x $number_of_test_modes. But perhaps
    runner startup overhead becomes the limiting factor at some
    point.

 8. To a first approximation, does anyone really care about getting
    an exhaustive list of all failures in a run, or just that we have
    *a* failure? You can always do an exhaustive run later.

 9. On the "no" answer to #8: when I build/test my own git I first
    run the tests I modified in the relevant branches, and if any of
    those fail I just stop. I generally don't need to run the
    entirety of the rest of the test suite before stopping to
    investigate a failure. Perhaps our CI could use a similar trick:
    first test the set of modified test files, perhaps with some
    ad-hoc matching of filenames, so that e.g. if you modify
    builtin/add.c we'd run t/*add*.sh in the first set, all with
    --immediate per #8 above (a rough sketch follows below). If we
    pass that we'd run the full set, minus that initial set.
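For what it's worth, a minimal sketch of that first pass, assuming
the loose builtin/<cmd>.c -> t/*<cmd>*.sh naming convention and a
branch that diffs against @{upstream}; the glue here is hypothetical,
only --immediate itself is a real test option (see t/README):

    # tests the branch itself touched, plus an ad-hoc name match;
    # compute the lists before cd-ing, pathspecs are cwd-relative
    branch_tests=$(git diff --name-only @{upstream}.. -- 't/t[0-9]*.sh')
    changed_builtins=$(git diff --name-only @{upstream}.. -- 'builtin/*.c')
    cd t &&
    for f in $branch_tests
    do
        t=${f#t/}
        test -e "$t" || continue # e.g. a test the branch deleted
        ./"$t" --immediate || exit 1
    done
    for f in $changed_builtins
    do
        cmd=$(basename "$f" .c)
        for t in t[0-9]*"$cmd"*.sh
        do
            test -e "$t" || continue # the glob matched nothing
            ./"$t" --immediate || exit 1
        done
    done
    # only if this quick pass is green do we go on to run the full
    # suite, minus what we've already run here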