Junio C Hamano <gitster@xxxxxxxxx> writes: > Glen Choo <chooglen@xxxxxxxxxx> writes: > >> Jonathan Tan <jonathantanmy@xxxxxxxxxx> writes: >> >>> "Glen Choo via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: >>>> Glen Choo (5): >>>> Documentation/git-config.txt: add SCOPES section >>>> Documentation: define protected configuration >>> >>> Forgot to mention when I was sending my comments on patch 2: we should >>> standardize on "protected config" and not use "protected configuration" >>> anywhere. >> >> Makes sense. > > Using a single word consistently does make sense, but why favor a > non-word over a proper word ;-)? Hm, I guess there's an argument that "config" is a term of art that specifically refers to things from "git config". From that lens, it's much less confusing to see the CONFIGURATION section in Documentation/git-config.txt. But the argument is a little flimsy because I don't think that's something we've stuck to anywhere. I'll use "configuration" if it's not too unwieldy. >> I suppose that the idea behind this is that we only parse and store each >> config file exactly once. It's a good goal, but the whole point of the >> configset is that we can query a single struct to figure out the value >> of a config variable. Having multiple configsets starts to shift more of >> the burden to the callers because they now have to query multiple >> configsets to find their desired config value, and we already start to >> see some of this unpleasantness in this series. > > Yes, I was worried about this, too. "parse and store exactly once" > may merely be a performance thing, but it still matters, even though > it is not worse than making duplicate callbacks to overwrite globals > that have been already set earlier, which will affect correctness ;-) Exactly. >> An alternative that I'd been thinking about is to make a few changes to >> the git_config_* + configset API to allow us to use a single configset >> for all of our needs: >> >> 1. Keep track of what config we've read when reading into >> the_repository->config, i.e. instead of a boolean "all config has >> been [un]read", we can express "system and global config has been >> read, but not local or command config". Then, use this information to >> load config from sources as they become available. This will allow us >> to read incomplete config for trace2 and setup.c (discovery.bare and >> safe.directory), and only read what we need later on. > > That is not a bad direction to go, but are we sure that we always > read in the right order (and there is one single right order) and > stop at the right step? > > config.c::do_git_config_sequence() reads the system and then the > global before the local, the worktree, and the command line. We > would allow the values of "protected" configuration variables to be > inspected by stopping after the first two and inspecting the result > before the local and the rest overrides them, but will we need > *only* that kind of partial configuration reading that stops exactly > there? Even with the proposed "protected" scheme, I thought we plan > to honor the command line ones, so we may need to read > system+global+command without reading anything else to grab the > values only from the protected sources (ah, I like the application > of the adjective "protected" to the source, not variables, because > that is what we are really talking about---alternatively we could > call it "safe"). But if we later read local and worktree ones > lazily, unless we _insert_ them before what we read from the command > line, we'll break the last-one-wins property, so we need to be > careful. I guess each configuration value in the configset knows > where it came from, so it probably is possible to insert the ones > you read lazily later in the right spot. Yeah, last-one-wins makes this a lot trickier. I thought that it would be nice to have insert-with-priority because that also eliminates some of the correctness concerns in this series, i.e. that ensures protected config has the same priority as regular config, but that's a bigger undertaking and I'm not certain about the performance. >> 2. Add an additional argument that specifies what scopes to respect when >> reading config (maybe as a set of flags). This gives us extra >> specificity when using the git_config*() functions, so we could get >> rid of git_protected_config() like so: >> >> /* Change enum config_scope into flags first... */ >> >> #define WIP_SCOPES_PROTECTED = CONFIG_SCOPE_SYSTEM & \ >> CONFIG_SCOPE_GLOBAL & CONFIG_SCOPE_COMMAND >> >> static enum discovery_bare_allowed get_discovery_bare(void) >> { >> enum discovery_bare_allowed result = DISCOVERY_BARE_ALWAYS; >> git_config(discovery_bare_cb, &result, WIP_SCOPES_PROTECTED); >> return result; >> } > > Alternatively, we could make the callback aware of the scope for > each var-value it is called and have it filter, but that would be a > bigger surgery. > > I think a new iterator git_config_in_scope(), instead of updating > git_config(), would make sense. By definition, all existing > git_config() callers do not need the scope specifiers, and > "protected" may be the first one but will not be the last one that > needs to read from particular scopes. Makes sense. The signature of git_config() could stay the same, but we could refactor it to use git_config_in_scope().