On Tue, Mar 07 2023, Glen Choo wrote: > Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes: > >> On Wed, Mar 01 2023, Glen Choo via GitGitGadget wrote: >> >>> This series extracts the global config reading state into "struct >>> config_reader" and plumbs it through the config reading machinery. It's very >>> similar to how we've plumbed "struct repository" and other 'context objects' >>> in the past, except: >>> >>> * The global state (named "the_reader") for the git process lives in a >>> config.c static variable, and not on "the_repository". See 3/6 for the >>> rationale. >> >> I agree with the overall direction, but don't think that rationale in >> 3/6 is sufficient to go in this "the_reader" direction, as opposed to >> sticking with and extending "the_repository" approach. >> >> For orthagonal reasons (getting rid of some of the API duplication) I've >> been carrying a patch to get rid of the "configset" part of the *public* >> API, i.e. to have API users always use the "repo_config_*()" or >> "git_config_*()" variants, that patch is at: >> https://github.com/avar/git/commit/0233297a359bbda43a902dd0213aacdca82faa34 > > Those patches are probably worth sending, even if only as RFC. I found > it pretty hard to draft a substantial response without effectively doing > a full review of the patch. Yes, sorry. It's part of some changes on top of my outstanding config API changes (just re-rolled at https://lore.kernel.org/git/cover-v6-0.9-00000000000-20230307T180516Z-avarab@xxxxxxxxx/). Hopefully those will land soon after the upcoming release, I'll try to submit this (along with related changes) soon afterwards. >> It's a bit distasteful, but that change argues that just mocking up a >> "struct repository" with a "config" member pointing to a new >> configset is better than maintaining an entirely different API just >> for those cases where we need to parse a one-off file or whatever. >> >> I think that going in that direction neatly solves the issues you're >> noting here and in your 3/6, i.e. we'd always have this in the "repo" >> object, so we'd just stick the persistent "reader" variables in the >> "struct repository"'s "config" member. > > If I understand your proposal correctly, we would move the config > variables to the_repository. Then, any time a caller would like to work > with an individual file, it would init a new "struct repository" with a > clean set of config members (using repo_init_repo_blank_config() or > something) and reuse the repo_config_* API? It's certainly a hack, but so is introducing a new "the_reader" singleton whose lifetime we need to manage seperately from "the_repository" in the common case :) I think a better argument for this is probably that if you try to change repository.h so that we define "struct repository" thusly (The "hash_algo" field being still there due to a very common macro): struct repository { const struct git_hash_algo *hash_algo; struct config_set *config; }; And then try to: make config.o You'll get: $ make config.o CC config.o config.c: In function ‘include_by_branch’: config.c:311:46: error: ‘struct repository’ has no member named ‘gitdir’ 311 | const char *refname = !the_repository->gitdir ? | ^~ config.c: In function ‘repo_read_config’: config.c:2523:30: error: ‘struct repository’ has no member named ‘commondir’ 2523 | opts.commondir = repo->commondir; | ^~ config.c:2524:28: error: ‘struct repository’ has no member named ‘gitdir’ 2524 | opts.git_dir = repo->gitdir; | ^~ make: *** [Makefile:2719: config.o] Error 1 I.e. almost all of the config code doesn't care about the repository at all, it just needs the "struct config_set" that's in the repository struct. With the linked-to change I'm arguing that just mocking it up sucks less than carrying a duplicate set of API functions just for those rare cases where we need to one-off read a config file. But... > It is a workable solution, e.g. that approach would work around the > failures in test-tool and scalar that I observed. In the spirit of > libification, this feels like a kludge, though, since we'd be reverting > to using "struct repository" for more things instead of using more > well-scoped interfaces. IMO a better future for the config_set API would > be to move it into configset.c or something, where only users who want > the low level API would use it and everyone else would just pretend it > doesn't exist. This would be a little like libgit2's organization, where > 'general config', 'config parsing' and 'in-memory config value > representations' are separate files, e.g. > > https://github.com/libgit2/libgit2/blob/main/src/libgit2/config.h > https://github.com/libgit2/libgit2/blob/main/src/libgit2/config_parse.h > https://github.com/libgit2/libgit2/blob/main/src/libgit2/config_entries.h > > I also hesitate to put the config variables on the_repository, because > in the long term, I think "struct config_reader" can and should be > purely internal to config.c. But if we start advertising its existence > via the_repository, that might be an invitation to (ab)use that API and > make that transition harder. ...yes it's a kludge, but I'd think if you're interested in more generally fixing it that a better use of time would be to narrowly focus on those cases where we don't have a "repository" now, i.e. the configset API users my linked-to patch shows & amends. Everything else that uses git_config_get_*(), repo_config_get_*() will either use an already set up "the_repository", or lazily init it, see the git_config_check_init() API users. Or, we could have repo_config() and friends not take a "struct repository", but another config-specific object which has the subset of the three fields from that struct which it actually needs (config, gitdir & commondir). The mocking function I added in the linked-to commit was just a way of getting to that via the shortest means.