Re: [PATCH] for-each-repo: do nothing on empty config

Derrick Stolee <stolee@xxxxxxxxx> · Wed, 6 Jan 2021 06:54:16 -0500

On 1/5/2021 11:20 PM, Eric Sunshine wrote:
> On Tue, Jan 5, 2021 at 9:20 PM Derrick Stolee <stolee@xxxxxxxxx> wrote:
>> On 1/5/2021 12:55 PM, Eric Sunshine wrote:
>>> On Tue, Jan 5, 2021 at 9:44 AM Derrick Stolee via GitGitGadget
>>> <gitgitgadget@xxxxxxxxx> wrote:
>>> Probably not worth a re-roll, but the above has higher cognitive load than:
>>>
>>>     if (!value)
>>>         return 0;
>>>
>>> which indicates clearly that the command succeeded, whereas `return
>>> result` makes the reader scan all the code above the conditional to
>>> figure out what values `result` could possibly hold.
>>
>> True. Looking at this again, it might be better to just update the
>> loop to be
>>
>>         for (i = 0; values && i < values->nr; i++) {
>>
>> which would run the loop zero times when there are zero results.
> 
> I find the explicit `return 0` easier to reason about since I can see
> immediately that it handles a particular boundary condition, after
> which I no longer have to think about that condition as I continue
> reading the code.
> 
> That said, I don't think it matters greatly one way or the other
> whether you use `return result`, `return 0`, or adjust the loop
> condition. It might matter if more functionality is added later, but
> we don't have to worry about it now, and it's not worth re-rolling
> just for this, or spending a lot of time discussing it.

Junio made a good point that 'values' doesn't change during the loop,
so I shouldn't use it in the sentinel. I'll use your "return 0;"

>>>> +       git for-each-repo --config=bogus.config -- these args would fail
>>>
>>> The `these args would fail` arguments add mystery to the test,
>>> prompting the reader to wonder what exactly is going on: "Fail how?",
>>> "Is it supposed to fail?". It might be less confusing to use something
>>> more direct like `nonexistent` or `does not exist`.
>>
>> I guess the point is that if we actually did try running a subcommand
>> on a repo (for instance, accidentally running it on the current repo)
>> then the command would fail when running "git these args would fail".
>>
>> To me, "nonexistent" or "does not exist" doesn't adequately describe
>> this (hypothetical) situation.
>>
>> Perhaps "fail-subcommand" might be better?
> 
> I had never even looked at git-for-each-repo before, so I took the
> time to read the documentation and the source code before writing this
> reply. Now that I understand what is supposed to happen, I might be
> tempted to suggest `this-command-wont-be-run` as an argument, but
> that's a mouthful. If you want to be really explicit that it should
> fail if a bug gets re-introduced which causes the argument to be run
> even though the config is empty, then perhaps use `false`:
> 
>     git for-each-repo --config=bogus.config -- false

I'm just going to repeat that this would run "git false". It achieves
the same goal where we interpret this as a failure if the subcommand
is run.

> By the way, when reading the documentation, some questions came to mind.
> 
> Is the behavior implemented by this patch desirable? That is, if the
> user mistypes the name of the configuration variable, would it be
> better to let the user know about the problem by returning an error
> code rather than succeeding silently? Or do you foresee people
> intentionally running the command against an empty config variable?
> That might be legitimate in automation situations. If legitimate to
> have an empty config, then would it be helpful to somehow distinguish
> between an empty config variable and a non-existent one (assuming we
> can even do that)?

As mentioned in the message, this is the situation the background
maintenance would see if a user used 'git maintenance unregister' on
all of their repos or removed the 'maintenance.repo' config values
from their global config. I think it would be better to not start
failing the background commands.

Even outside of that context, we have no way to specify an "existing
but empty" multi-valued config value over a non-existing config
value. I'd rather prefer the interpretation "I succeeded in doing
nothing" instead of "I think you made a mistake, because there's
nothing to do."

Could we meet in the middle and print a warning to stderr?

	warning(_("supplied config key '%s' has no values"));

That would present a useful message to users running this command
manually (or constructing automation) without breaking scripts
that parse the output.

> Is the API of this command ideal? It feels odd to force the user to
> specify required input via a command-line option rather than just as a
> positional argument. In other words, since the config variable name is
> mandatory, an alternative invocation format would be:
> 
>     git for-each-repo <config-var> <cmd>
> 
> Or do you foresee future enhancements expanding the ways in which the
> repo list can be specified (such as from a file or stdin or directly
> via the command-line), in which case the `--config` option makes
> sense.

I believe some voice interest in specifying a list of repositories
by providing a directory instead of a config value. That would
require scanning the immediate subdirectories to see if they are Git
repos.

A --stdin option would also be a good addition, if desired.

Since this command was intended for automation, not end-users, it
seemed that future-proofing the arguments in this way would be a good
idea. We want to remain flexible for future additions without
breaking compatibility.

Thanks,
-Stolee