Re: [PATCH v4 0/5] config: introduce discovery.bare and protected config

Junio C Hamano <gitster@xxxxxxxxx> · Thu, 23 Jun 2022 11:32:27 -0700

Glen Choo <chooglen@xxxxxxxxxx> writes:

> Jonathan Tan <jonathantanmy@xxxxxxxxxx> writes:
>
>> "Glen Choo via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
>>> Glen Choo (5):
>>>   Documentation/git-config.txt: add SCOPES section
>>>   Documentation: define protected configuration
>>
>> Forgot to mention when I was sending my comments on patch 2: we should
>> standardize on "protected config" and not use "protected configuration"
>> anywhere.
>
> Makes sense.

Using a single word consistently does make sense, but why favor a
non-word over a proper word ;-)?

> I suppose that the idea behind this is that we only parse and store each
> config file exactly once. It's a good goal, but the whole point of the
> configset is that we can query a single struct to figure out the value
> of a config variable. Having multiple configsets starts to shift more of
> the burden to the callers because they now have to query multiple
> configsets to find their desired config value, and we already start to
> see some of this unpleasantness in this series.

Yes, I was worried about this, too.  "parse and store exactly once"
may merely be a performance thing, but it still matters, even though
it is not worse than making duplicate callbacks to overwrite globals
that have been already set earlier, which will affect correctness ;-)

> An alternative that I'd been thinking about is to make a few changes to
> the git_config_* + configset API to allow us to use a single configset
> for all of our needs:
>
> 1. Keep track of what config we've read when reading into
>    the_repository->config, i.e. instead of a boolean "all config has
>    been [un]read", we can express "system and global config has been
>    read, but not local or command config". Then, use this information to
>    load config from sources as they become available. This will allow us
>    to read incomplete config for trace2 and setup.c (discovery.bare and
>    safe.directory), and only read what we need later on.

That is not a bad direction to go, but are we sure that we always
read in the right order (and there is one single right order) and
stop at the right step?

config.c::do_git_config_sequence() reads the system and then the
global before the local, the worktree, and the command line.  We
would allow the values of "protected" configuration variables to be
inspected by stopping after the first two and inspecting the result
before the local and the rest overrides them, but will we need
*only* that kind of partial configuration reading that stops exactly
there?  Even with the proposed "protected" scheme, I thought we plan
to honor the command line ones, so we may need to read
system+global+command without reading anything else to grab the
values only from the protected sources (ah, I like the application
of the adjective "protected" to the source, not variables, because
that is what we are really talking about---alternatively we could
call it "safe").  But if we later read local and worktree ones
lazily, unless we _insert_ them before what we read from the command
line, we'll break the last-one-wins property, so we need to be
careful.  I guess each configuration value in the configset knows
where it came from, so it probably is possible to insert the ones
you read lazily later in the right spot.

> 2. Add an additional argument that specifies what scopes to respect when
>    reading config (maybe as a set of flags). This gives us extra
>    specificity when using the git_config*() functions, so we could get
>    rid of git_protected_config() like so:
>
>     /* Change enum config_scope into flags first... */
>
>     #define WIP_SCOPES_PROTECTED = CONFIG_SCOPE_SYSTEM & \
>       CONFIG_SCOPE_GLOBAL & CONFIG_SCOPE_COMMAND
>
>     static enum discovery_bare_allowed get_discovery_bare(void)
>     {
>       enum discovery_bare_allowed result = DISCOVERY_BARE_ALWAYS;
>       git_config(discovery_bare_cb, &result, WIP_SCOPES_PROTECTED);
>       return result;
>     }

Alternatively, we could make the callback aware of the scope for
each var-value it is called and have it filter, but that would be a
bigger surgery.

I think a new iterator git_config_in_scope(), instead of updating
git_config(), would make sense.  By definition, all existing
git_config() callers do not need the scope specifiers, and
"protected" may be the first one but will not be the last one that
needs to read from particular scopes.