Re: [PATCH v2 01/18] maintenance: create basic maintenance runner

Derrick Stolee <stolee@xxxxxxxxx> · Thu, 30 Jul 2020 09:12:56 -0400

On 7/29/2020 6:19 PM, Jonathan Nieder wrote:
> Hi,
> 
> Derrick Stolee wrote:
> 
>> [Subject: maintenance: create basic maintenance runner]
> 
> Seems sensible, and a good way to set up for the later patches.  Let's
> take a look at how it does that.
> 
> [...]
>> --- /dev/null
>> +++ b/Documentation/git-maintenance.txt
>> @@ -0,0 +1,57 @@
> [...]
>> +SUBCOMMANDS
>> +-----------
>> +
>> +run::
>> +	Run one or more maintenance tasks.
> 
> [jrnieder] How do I supply the tasks on the command line?  Are they
> parameters to this subcommand?  If so, it could make sense for this to
> say something like
> 
> 	run <task>...::

Hopefully this is documented to your satisfaction when the ability
to customize the tasks is implemented.

> What is the exit code convention for "git maintenance run"?  (Many Git
> commands don't document this, but since we're making a new command
> it seems worth building the habit.)

Is this worth doing? Do we really want every command to document
that "0 means success, everything else means failure, and some of
those exit codes mean a specific kind of failure that is global to
Git"?

> [...]
>> --- a/builtin/gc.c
>> +++ b/builtin/gc.c
>> @@ -699,3 +699,62 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
> [...]
>> +static struct maintenance_opts {
>> +	int auto_flag;
>> +} opts;
> 
> Packing this in a struct feels a bit unusual.  Is the struct going to
> be passed somewhere, or could these be individual locals in
> cmd_maintenance?

This will grow, and I'd rather have one global struct than many
individual global items. It makes it clearer when I use
"opts.auto_flag" that this corresponds to whether "--auto" was
provided as a command-line option.

> [...]
>> +
>> +static int maintenance_task_gc(void)
>> +{
>> +	int result;
>> +	struct argv_array cmd = ARGV_ARRAY_INIT;
>> +
>> +	argv_array_pushl(&cmd, "gc", NULL);
>> +
>> +	if (opts.auto_flag)
>> +		argv_array_pushl(&cmd, "--auto", NULL);
> 
> These are both pushing single strings, so they can use argv_array_push.

Thanks. I noticed a few of these myself. Luckily, I'll be going
through all of these invocations and replacing them with new
methods soon ;)

[1] https://lore.kernel.org/git/30933a71-3130-5478-cbfd-0ca5bb308cf2@xxxxxxxxx/

> [...]
>> --- /dev/null
>> +++ b/t/t7900-maintenance.sh
>> @@ -0,0 +1,22 @@
>> +#!/bin/sh
>> +
>> +test_description='git maintenance builtin'
>> +
>> +GIT_TEST_COMMIT_GRAPH=0
>> +GIT_TEST_MULTI_PACK_INDEX=0
> 
> Why does this disable commit graph and multipack index?  Is that setting
> up for something that comes later?

Yes, these don't need to be here yet.

> [...]
>> +test_expect_success 'gc [--auto]' '
>> +	GIT_TRACE2_EVENT="$(pwd)/run-no-auto.txt" git maintenance run &&
>> +	GIT_TRACE2_EVENT="$(pwd)/run-auto.txt" git maintenance run --auto &&
>> +	grep ",\"gc\"]" run-no-auto.txt  &&
>> +	grep ",\"gc\",\"--auto\"]" run-auto.txt
> 
> This feels a little odd in two ways:
> 
> - the use of "git gc" isn't a user-facing detail, so this is testing
>   implementation instead of the user-facing behavior.  That's okay,
>   though --- it can be useful to test internals sometimes.

Consider this a "unit test" of the builtin. I'm testing whether the
command-line arguments had an effect on the child process.

> - the way that this tests for "git gc" feels brittle: if the trace
>   emitter changes some day to include a space after the comma, for
>   example, then the test would start failing.  I thought that in the
>   spirit of fakes, it could make sense to write a custom "git gc"
>   command using test_write_script, but that isn't likely to work
>   either since gc is a builtin.
> 
> Perhaps this is suggesting that we need some central test helper for
> parsing traces so we can do this reliably in one place.  What do you
> think?

I'm specifically using GIT_TRACE2_EVENT, which is intended for
consumption by computer, not humans. Adding whitespace or otherwise
changing the output format would be an unnecessary disruption that
is more likely to have downstream effects with external tools.

In some way, adding extra dependencies like this in our own test
suite can help ensure the output format is more stable.

If there is a better way to ask "Did my command call 'git gc' (with
no arguments|with these arguments)?" then I'm happy to consider it.

Thanks,
-Stolee