On Wed, Sep 28 2022, Derrick Stolee wrote:

> On 9/27/22 3:18 PM, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Tue, Sep 27 2022, Derrick Stolee wrote:
>>
>>> On 9/27/2022 12:21 PM, Ævar Arnfjörð Bjarmason wrote:
>>>>
>>>> On Tue, Sep 27 2022, Derrick Stolee via GitGitGadget wrote:
>>>
>>>>>  /**
>>>>>   * Finds and returns the value list, sorted in order of increasing priority
>>>>>   * for the configuration variable `key`. When the configuration variable
>>>>> - * `key` is not found, returns NULL. The caller should not free or modify
>>>>> - * the returned pointer, as it is owned by the cache.
>>>>> + * `key` is not found, returns an empty list. The caller should not free or
>>>>> + * modify the returned pointer, as it is owned by the cache.
>>>>>   */
>>>>>  const struct string_list *git_config_get_value_multi(const char *key);
>>>>
>>>> Aside from the "DWIM API" aspect of this (which I don't mind) I think
>>>> this is really taking the low-level function in the wrong direction, and
>>>> that we should just add a new simple wrapper instead.
>>>>
>>>> I.e. both the pre-image API docs & this series gloss over the fact that
>>>> we'd not just return NULL here if the config wasn't there, but also if
>>>> git_config_parse_key() failed.
>>>>
>>>> So it seems to me that a better direction would be starting with
>>>> something like the WIP below (which doesn't compile the whole code, I
>>>> stopped at config.[ch] and pack-bitmap.c). I.e. the same "int" return
>>>> and "dest" pattern that most other things in the config API have.
>>>
>>> Do you have an example where a caller would benefit from this
>>> distinction? Without such an example, I don't think it is worth
>>> creating such a huge change for purity's sake alone.
>>
>> Not initially, I started poking at this because the CL/series/commits
>> says that we don't care about the case of non-existing keys, without
>> being clear as to why we want to conflate that with other errors we
>> might get from this API.
>>
>> But after some digging I found:
>>
>>     $ for k in a a.b. "'x.y"; do ./git for-each-repo --config=$k; echo $?; done
>>     error: key does not contain a section: a
>>     0
>>     error: key does not contain variable name: a.b.
>>     0
>>     error: invalid key: 'x.y
>>     0
>>
>> I.e. the repo_config_get_value_multi() you added in for-each-repo
>> doesn't distinguish between bad keys and non-existing keys, and returns
>> 0 even though it printed an "error".
>
> I can understand wanting to inform the user that they provided an
> invalid key using a nonzero exit code. I can also understand that
> the command does what is asked: it did nothing because the given
> key has no values (because it can't). I think the use of an "error"
> message balances things towards wanting a nonzero exit code.

Right, to be clear I think 6c62f015520 (for-each-repo: do nothing on
empty config, 2021-01-08) is sensible, i.e. we want to return 0 on a
non-existing key.

We just shouldn't conflate that with e.g. these parse errors, which is
what the API imposes on the user by squashing the underlying negative
return values into the "NULL list".

>>> I'm pretty happy that the diff for this series is an overall
>>> reduction in code, while also not being too large in the interim:
>>>
>>>  12 files changed, 39 insertions(+), 57 deletions(-)
>>>
>>> If all callers that use the *_multi() methods would only use the
>>> wrapper, then what is the point of doing the low-level manipulations?
>>
>> I hacked up something that's at least RFC-quality based on this
>> approach, but CI is running etc., so not submitting it
>> now:
>>
>>     https://github.com/git/git/compare/master...avar:git:avar/have-git_configset_get_value-use-dest-and-int-pattern
>>
>> I think the resulting diff is more idiomatic API use, i.e. you ended up
>> with:
>>
>>      /* submodule.active is set */
>>      sl = repo_config_get_value_multi(repo, "submodule.active");
>> -    if (sl) {
>> +    if (sl && sl->nr) {
>
> You're right that I forgot to change this one to "if (sl->nr)"
> in patch 5.

If I am, I didn't mean to point that out, I was just pointing out the
end-API use. I.e. int return value vs. the "populate dest" pattern, but
yes, in your end-state you'd drop the "sl &&" part.

>> But I ended up doing:
>>
>>      /* submodule.active is set */
>> -    sl = repo_config_get_value_multi(repo, "submodule.active");
>> -    if (sl) {
>> +    if (!repo_config_get_const_value_multi(repo, "submodule.active", &sl)) {
>>
>> Note the "const" in the function name, i.e. there's wrappers that handle
>> the case where we have a hardcoded key name, in which case we can BUG()
>> out if we'd return < 0, so all we have left is just "does key exist".
>
> The problem here is that the block actually cares that the list is non-empty
> and should not run if the list is empty. In that case, you would need to add
> "&& sl->nr" to the condition.
>
> I'm of course assuming that an empty list is different from an error. In
> your for-each-repo example, we would not want to return a non-zero exit
> code on an empty list, only on a bad key (or other I/O problem).
>
> If we return a negative value on an error and the number of matches on
> success, then this change could instead be "if (repo_config....() > 0)".

Hrm, I think you're confusing the worldview your series here is
advocating for, and what I'm suggesting as an alternative.

There isn't any way on "master" to have "an empty list", that's a
worldview you're proposing. In particular your 1/5 here removes:

	assert(values->nr > 0);

More generally the config format has no notion of "an empty list", if
you have a valid key-value pair at all you have a list of ".nr >= 1".
The "empty list" is a construct you're introducing in this series,
because you wanted the idiom of passing things to
for_each_string_list_item.

I'm advocating for not going that route, and instead making the
*_multi() method like the rest of the config API. I.e. to use the
"return int, populate dest" pattern.

It's fine if we disagree, but I get the sense that it's not clear what
we're disagreeing *on* :)

>> In any case, I'm all for having some simple wrapper for the common cases
>
> A simple wrapper would be nice, and be exactly the method as it is
> updated in this series. The error-result version could be adopted when
> there is reason to do so.

Well, no :) We ended up with two different "simple wrapper[s]", mine
doesn't have this notion of a "struct string_list *list" with .nr == 0.

>> But I didn't find a single case where we actually needed this "never
>> give me a non-NULL list" behavior, it could just be generalized to
>> "let's have the API tell us if the key exist".
>
> Most cases want to feed the result into the for_each_string_list_item()
> macro. Based on the changes in patch 5, I think the empty list is a
> better pattern and leads to prettier code in almost all cases.

I updated the WIP RFC series I linked to upthread a bit since my initial
reply (the link is still good, I force-pushed), I then rebased your
series here on "master", below is a diff of some select files.

The overall diff is much bigger obviously (API changes and all), but the
below demonstrates some of the API changes (yours is "-", mine is "+").
I've commented inline on some of it:

diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c
index 635ea5e15fd..16e9a76d04a 100644
--- a/builtin/for-each-repo.c
+++ b/builtin/for-each-repo.c
@@ -29,6 +29,7 @@ int cmd_for_each_repo(int argc, const char **argv, const char *prefix)
 	static const char *config_key = NULL;
 	int i, result = 0;
 	const struct string_list *values;
+	int err;
 
 	const struct option options[] = {
 		OPT_STRING(0, "config", &config_key, N_("config"),
@@ -42,8 +43,13 @@ int cmd_for_each_repo(int argc, const char **argv, const char *prefix)
 	if (!config_key)
 		die(_("missing --config=<config>"));
 
-	values = repo_config_get_value_multi(the_repository,
-					     config_key);
+	err = repo_config_get_value_multi(the_repository, config_key, &values);
+	if (err < 0)
+		usage_msg_optf(_("got bad config --config=%s"),
+			       for_each_repo_usage, options, config_key);
+	else if (err)
+		return 0;
+
 	for (i = 0; !result && i < values->nr; i++)
 		result = run_command_on_repo(values->items[i].string, argc, argv);
 

Here we're relaying an error to the user that we couldn't before, because
repo_config_get_value_multi() would return "NULL" for both "key is bad"
and "key doesn't exist". There's a corresponding test modification
below.
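
To spell out the three outcomes the caller above branches on, here is a
minimal standalone sketch of the "return int, populate dest" shape. It
is not git's config code: every toy_* name, the table contents and the
simplified key check are made up for illustration, and the real key
parsing is stricter than this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for git's string_list; not the real struct. */
struct toy_list {
	const char **items;
	size_t nr;
};

/* One toy config entry: a key and its value list. */
struct toy_entry {
	const char *key;
	struct toy_list values;
};

static const char *repo_values[] = { "/srv/a", "/srv/b" };
static const struct toy_entry toy_config[] = {
	{ "maintenance.repo", { repo_values, 2 } },
};

/* Simplified check: require "section.name" with both parts non-empty. */
int toy_key_is_valid(const char *key)
{
	const char *dot = strrchr(key, '.');

	return dot && dot != key && dot[1] != '\0';
}

/*
 * Returns < 0 for a bad key, 0 when the key exists (*dest then points
 * at a list with nr >= 1), and 1 when the key is valid but not set.
 */
int toy_get_value_multi(const char *key, const struct toy_list **dest)
{
	size_t i;

	if (!toy_key_is_valid(key))
		return -1;
	for (i = 0; i < sizeof(toy_config) / sizeof(toy_config[0]); i++) {
		if (!strcmp(toy_config[i].key, key)) {
			/* A key that exists always has at least one value. */
			assert(toy_config[i].values.nr > 0);
			*dest = &toy_config[i].values;
			return 0;
		}
	}
	return 1;
}
```

A caller that treats "< 0" as a usage error and "> 0" as "nothing to
do" never has to look at an empty or NULL list, which is the point of
the pattern.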
diff --git a/builtin/gc.c b/builtin/gc.c
index 1e9ac2ac7e3..94b77a88a99 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -1472,9 +1472,7 @@ static int maintenance_register(int argc, const char **argv, const char *prefix)
 	};
 	int found = 0;
 	const char *key = "maintenance.repo";
-	char *config_value;
 	char *maintpath = get_maintpath();
-	struct string_list_item *item;
 	const struct string_list *list;
 
 	argc = parse_options(argc, argv, prefix, options,
@@ -1487,18 +1485,11 @@ static int maintenance_register(int argc, const char **argv, const char *prefix)
 	git_config_set("maintenance.auto", "false");
 
 	/* Set maintenance strategy, if unset */
-	if (!git_config_get_string("maintenance.strategy", &config_value))
-		free(config_value);
-	else
+	if (git_config_lookup_value("maintenance.strategy"))
 		git_config_set("maintenance.strategy", "incremental");
 

In looking at this I thought we were way overdue for a "does this key
exist?" helper, this and a few other API users use it.

-	list = git_config_get_value_multi(key);
-	for_each_string_list_item(item, list) {
-		if (!strcmp(maintpath, item->string)) {
-			found = 1;
-			break;
-		}
-	}
+	if (!git_config_get_const_value_multi(key, &list))
+		found = unsorted_string_list_has_string(list, maintpath);
 

So, it turns out that the initial reason you wanted the "pass NULL to
for_each_string_list_item" is actually something we can do with
unsorted_string_list_has_string(), which implements the same loop.

The difference here is *the* API difference we're discussing. I.e. we'll
never get a NULL "list", we'll instead always get a non-NULL list with
>= 1 item if we can get this key at all.

The "const value" helper is a wrapper that handles the "err < 0"
case. In cases where we hardcode the key it's a BUG() if we get "err <
0". The wrapper is just:

	int err = git_configset_get_value_multi(cs, key, dest);
	if (err < 0)
		BUG("failed to parse constant key '%s'!", key);
	return err;

[...]
@@ -1547,13 +1537,8 @@ static int maintenance_unregister(int argc, const char **argv, const char *prefi
 		usage_with_options(builtin_maintenance_unregister_usage,
 				   options);
 
-	list = git_config_get_value_multi(key);
-	for_each_string_list_item(item, list) {
-		if (!strcmp(maintpath, item->string)) {
-			found = 1;
-			break;
-		}
-	}
+	if (!git_config_get_const_value_multi(key, &list))
+		found = unsorted_string_list_has_string(list, maintpath);
 

Ditto, the same git_config_get_const_value_multi() &
unsorted_string_list_has_string() pattern.

 	if (found) {
 		int rc;
diff --git a/builtin/log.c b/builtin/log.c
index 719ef966045..bdb87f6c42b 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -182,13 +182,15 @@ static void set_default_decoration_filter(struct decoration_filter *decoration_f
 	int i;
 	char *value = NULL;
 	struct string_list *include = decoration_filter->include_ref_pattern;
-	struct string_list_item *item;
-	const struct string_list *config_exclude =
-			git_config_get_value_multi("log.excludeDecoration");
+	const struct string_list *config_exclude;
 
-	for_each_string_list_item(item, config_exclude)
-		string_list_append(decoration_filter->exclude_ref_config_pattern,
-				   item->string);
+	if (!git_config_get_const_value_multi("log.excludeDecoration",
+					      &config_exclude)) {
+		struct string_list_item *item;
+		for_each_string_list_item(item, config_exclude)
+			string_list_append(decoration_filter->exclude_ref_config_pattern,
+					   item->string);
+	}
 

Here's a case where we need to use for_each_string_list_item(), I think
it's nice how we can now scope the "item" variable.

 	/*
 	 * By default, decorate_all is disabled. Enable it if
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 5a8b6120157..b758255f816 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -552,7 +552,7 @@ static int module_init(int argc, const char **argv, const char *prefix)
 	 * If there are no path args and submodule.active is set then,
 	 * by default, only initialize 'active' modules.
 	 */
-	if (!argc && git_config_get_value_multi("submodule.active")->nr)
+	if (!argc && !git_config_lookup_value("submodule.active"))
 		module_list_active(&list);
 

You changed these in your 2/5, but they really just wanted the new "does
this key exist?" API. No need to construct the string_list just to throw
it away...

 	info.prefix = prefix;
@@ -2720,7 +2720,7 @@ static int module_update(int argc, const char **argv, const char *prefix)
 	 * If there are no path args and submodule.active is set then,
 	 * by default, only initialize 'active' modules.
 	 */
-	if (!argc && git_config_get_value_multi("submodule.active")->nr)
+	if (!argc && !git_config_lookup_value("submodule.active"))
 		module_list_active(&list);
 

Ditto.

 	info.prefix = opt.prefix;
@@ -3164,7 +3164,6 @@ static int config_submodule_in_gitmodules(const char *name, const char *var, con
 static void configure_added_submodule(struct add_data *add_data)
 {
 	char *key;
-	const char *val;
 	struct child_process add_submod = CHILD_PROCESS_INIT;
 	struct child_process add_gitmodules = CHILD_PROCESS_INIT;
 
@@ -3209,7 +3208,7 @@ static void configure_added_submodule(struct add_data *add_data)
 	 * is_submodule_active(), since that function needs to find
 	 * out the value of "submodule.active" again anyway.
 	 */
-	if (!git_config_get_string_tmp("submodule.active", &val)) {
+	if (!git_config_lookup_value("submodule.active")) {
 		/*
 		 * If the submodule being added isn't already covered by the
 		 * current configured pathspec, set the submodule's active flag

Ditto.
diff --git a/submodule.c b/submodule.c
index 06230961c80..4474cf9ed2d 100644
--- a/submodule.c
+++ b/submodule.c
@@ -274,8 +274,7 @@ int is_tree_submodule_active(struct repository *repo,
 	free(key);
 
 	/* submodule.active is set */
-	sl = repo_config_get_value_multi(repo, "submodule.active");
-	if (sl && sl->nr) {
+	if (!repo_config_get_const_value_multi(repo, "submodule.active", &sl)) {
 		struct pathspec ps;
 		struct strvec args = STRVEC_INIT;
 		const struct string_list_item *item;

Another "*the* API difference we're discussing". I.e. sure, your end
state would be "if (sl->nr)", but if we make it return "int"...

diff --git a/t/helper/test-config.c b/t/helper/test-config.c
index 90810946783..432ad047537 100644
--- a/t/helper/test-config.c
+++ b/t/helper/test-config.c
@@ -95,8 +95,7 @@ int cmd__config(int argc, const char **argv)
 			goto exit1;
 		}
 	} else if (argc == 3 && !strcmp(argv[1], "get_value_multi")) {
-		strptr = git_config_get_value_multi(argv[2]);
-		if (strptr->nr) {
+		if (!git_config_get_const_value_multi(argv[2], &strptr)) {
 			for (i = 0; i < strptr->nr; i++) {
 				v = strptr->items[i].string;
 				if (!v)

Ditto (this one converts away from your preferred API use).

@@ -159,8 +158,7 @@ int cmd__config(int argc, const char **argv)
 				goto exit2;
 			}
 		}
-		strptr = git_configset_get_value_multi(&cs, argv[2]);
-		if (strptr && strptr->nr) {
+		if (!git_configset_get_const_value_multi(&cs, argv[2], &strptr)) {
 			for (i = 0; i < strptr->nr; i++) {
 				v = strptr->items[i].string;
 				if (!v)

Ditto, sans that you'd presumably want s/strptr && // here.

diff --git a/t/t0068-for-each-repo.sh b/t/t0068-for-each-repo.sh
index 4675e852517..115221c9ca5 100755
--- a/t/t0068-for-each-repo.sh
+++ b/t/t0068-for-each-repo.sh
@@ -33,4 +33,10 @@ test_expect_success 'do nothing on empty config' '
 	git for-each-repo --config=bogus.config -- help --no-such-option
 '
 
+test_expect_success 'error on bad config keys' '
+	test_expect_code 129 git for-each-repo --config=a &&
+	test_expect_code 129 git for-each-repo --config=a.b. &&
+	test_expect_code 129 git for-each-repo --config="'\''.b"
+'
+
 test_done

A test showing the behavior change we can implement now that we don't
sweep the "err < 0" under the rug.

That branch also grew to have some other changes we may or may not want,
one thing was to convert the various *_get_*() functions that now
normalize the non-zero return value, e.g.:

 int git_configset_get_int(struct config_set *cs, const char *key, int *dest)
 {
 	const char *value;
-	if (!git_configset_get_value(cs, key, &value)) {
-		*dest = git_config_int(key, value);
-		return 0;
-	} else
-		return 1;
+	int err;
+
+	if ((err = git_configset_get_value(cs, key, &value)))
+		return err;
+	*dest = git_config_int(key, value);
+	return 0;
 }

No caller currently cares about it, but I think it makes sense generally
not to throw away errors if we can (whether that part is worth the churn
is another topic).

Anyway, the reason I started looking at this RFC to begin with was
because this *_multi() part of the config API has often seemed odd to
me, i.e. I wondered why we couldn't just have it use the "return int,
populate dest" pattern. I'd just never tried to see if I could get that
to work.

It's a bit of one-off churn to get to this point, but I think the end
result of having all the API functions act the same way to signal key
existence vs. validity is worth it.
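
To make the "don't throw away errors" point concrete outside of git, the
following is a small standalone sketch of the two shapes of a
*_get_int()-style wrapper. It is only an illustration under made-up
assumptions: the toy_* functions, the hardcoded "core.answer" key and
the simplified key check are all hypothetical, and atoi() stands in for
real integer parsing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical single-valued lookup with a three-way return value. */
int toy_get_value(const char *key, const char **dest)
{
	const char *dot = strrchr(key, '.');

	assert(dest);			/* passing NULL is a caller bug */
	if (!dot || dot == key || dot[1] == '\0')
		return -1;		/* malformed key */
	if (!strcmp(key, "core.answer")) {
		*dest = "42";
		return 0;		/* key exists */
	}
	return 1;			/* valid key, not set */
}

/* Old shape: every failure collapses to 1. */
int toy_get_int_lossy(const char *key, int *dest)
{
	const char *value;

	if (!toy_get_value(key, &value)) {
		*dest = atoi(value);
		return 0;
	}
	return 1;
}

/* New shape: pass the underlying return value through unchanged. */
int toy_get_int(const char *key, int *dest)
{
	const char *value;
	int err = toy_get_value(key, &value);

	if (err)
		return err;
	*dest = atoi(value);
	return 0;
}
```

With the lossy variant a caller cannot tell a malformed key from an
unset one, whereas the pass-through variant leaves that decision to the
caller.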