This series fixes numerous segfaults in config API users, because they didn't expect *_get_multi() to hand them a string_list with a NULL in it given config like "[a] key" (note, no "="'s). A larger general overview at v1[1], but note the API changes in v2[2]. Changes since v4[3]: * Added tests for value-less at the end of a list to 2/10, per Junio's request. * Clarifications in the 3/10 commit message, per v4's discussion. * Add RESULT_MUST_BE_USED to configset_find_element(), note that configset_add_value() doesn't have it, see the 3/10 commit message update. * 3/10 now has tests for the "get" family of functions, this should clarify any questions about the semantics of the API. * Junio suggested for 4/10 to skip renaming the "deprecated_prereleases" variable, I tried that, and if we keep it we need to wrap a line later in the series, which makes the versioncmp.c code harder to read. So I kept the renaming as it's refactored in 4/10. * Glen suggested that the new *_get() family should "return 1" on non-zero like the rest of *_get_*() (i.e. coerce 'ret < 0' to 'return 1'). That could be done here, but for the later "for-each-repo.c" we need to distinguish "bad key" v.s. "does this exist?", and just having that API return a more meaningful value would make it inconsistent with the rest. As the much of the point of this series is to make that API less of a special snowflake a new 6/10 instead finishes up the work of having most of the rest of the API return the un-coerced "ret" from the depths of the config API. That patch is quite large by line count, but pretty trivial in complexity. All of those functions are copy/pasted versions of one another with very minor variations. * Updated the 8/10 commit message, which was stale from a previous version of this topic. Branch & CI for this at: https://github.com/avar/git/tree/avar/have-git_configset_get_value-use-dest-and-int-pattern-5 1. https://lore.kernel.org/git/cover-00.10-00000000000-20221026T151328Z-avarab@xxxxxxxxx/ 2. https://lore.kernel.org/git/cover-v2-0.9-00000000000-20221101T225822Z-avarab@xxxxxxxxx/ 3. https://lore.kernel.org/git/cover-v4-0.9-00000000000-20230202T131155Z-avarab@xxxxxxxxx/ Ævar Arnfjörð Bjarmason (10): config tests: cover blind spots in git_die_config() tests config tests: add "NULL" tests for *_get_value_multi() config API: add and use a "git_config_get()" family of functions versioncmp.c: refactor config reading next commit config API: have *_multi() return an "int" and take a "dest" config API: don't lose the git_*get*() return values for-each-repo: error on bad --config config API users: test for *_get_value_multi() segfaults config API: add "string" version of *_value_multi(), fix segfaults for-each-repo: with bad config, don't conflate <path> and <cmd> builtin/for-each-repo.c | 14 +- builtin/gc.c | 15 +- builtin/log.c | 6 +- builtin/submodule--helper.c | 7 +- builtin/worktree.c | 3 +- config.c | 226 ++++++++++++++++++--------- config.h | 84 ++++++++-- pack-bitmap.c | 6 +- submodule.c | 3 +- t/helper/test-config.c | 28 +++- t/t0068-for-each-repo.sh | 19 +++ t/t1308-config-set.sh | 108 ++++++++++++- t/t3309-notes-merge-auto-resolve.sh | 7 +- t/t4202-log.sh | 15 ++ t/t5304-prune.sh | 12 +- t/t5310-pack-bitmaps.sh | 20 +++ t/t5552-skipping-fetch-negotiator.sh | 16 ++ t/t7004-tag.sh | 17 ++ t/t7413-submodule-is-active.sh | 16 ++ t/t7900-maintenance.sh | 38 +++++ versioncmp.c | 22 ++- 21 files changed, 549 insertions(+), 133 deletions(-) Range-diff against v4: 1: 4ae56cab7c7 = 1: cefc4188984 config tests: cover blind spots in git_die_config() tests 2: 1f0f8bdcde9 ! 2: 91a44456327 config tests: add "NULL" tests for *_get_value_multi() @@ t/t1308-config-set.sh: test_expect_success 'find multiple values' ' + mv "$config" "$config".old + fi && + ++ # Value-less in the middle of a list + cat >"$config" <<-\EOF && + [a]key=x + [a]key @@ t/t1308-config-set.sh: test_expect_success 'find multiple values' ' + ;; + esac && + test-tool config "$op" a.key $file >actual && ++ test_cmp expect actual && ++ ++ # Value-less at the end of a least ++ cat >"$config" <<-\EOF && ++ [a]key=x ++ [a]key=y ++ [a]key ++ EOF ++ case "$op" in ++ *_multi) ++ cat >expect <<-\EOF ++ x ++ y ++ (NULL) ++ EOF ++ ;; ++ *) ++ cat >expect <<-\EOF ++ (NULL) ++ EOF ++ ;; ++ esac && ++ test-tool config "$op" a.key $file >actual && + test_cmp expect actual + ' +} 3: 998b11ae4bc ! 3: 4a73151abde config API: add and use a "git_config_get()" family of functions @@ Commit message can now use a helper function that doesn't require a throwaway variable. - We could have changed git_configset_get_value_multi() to accept a - "NULL" as a "dest" for all callers, but let's avoid changing the - behavior of existing API users. Having an "unused" value that we throw - away internal to config.c is cheap. + We could have changed git_configset_get_value_multi() (and then + git_config_get_value() etc.) to accept a "NULL" as a "dest" for all + callers, but let's avoid changing the behavior of existing API + users. Having an "unused" value that we throw away internal to + config.c is cheap. + + A "NULL as optional dest" pattern is also more fragile, as the intent + of the caller might be misinterpreted if he were to accidentally pass + "NULL", e.g. when "dest" is passed in from another function. Another name for this function could have been "*_config_key_exists()", as suggested in [1]. That would work for all @@ Commit message commit where the git_configset_get_value_multi() function itself will expose this new return value. - 1. https://lore.kernel.org/git/xmqqczadkq9f.fsf@gitster.g/ + This still leaves various inconsistencies and clobbering or ignoring + of the return value in place. E.g here we're modifying + configset_add_value(), but ever since it was added in [2] we've been + ignoring its "int" return value, but as we're changing the + configset_find_element() it uses, let's have it faithfully ferry that + "ret" along. + + Let's also use the "RESULT_MUST_BE_USED" macro introduced in [3] to + assert that we're checking the return value of + configset_find_element(). + + We're leaving the same change to configset_add_value() for some future + series. Once we start paying attention to its return value we'd need + to ferry it up as deep as do_config_from(), and would need to make + least read_{,very_}early_config() and git_protected_config() return an + "int" instead of "void". Let's leave that for now, and focus on + the *_get_*() functions. + + In a subsequent commit we'll fix the other *_get_*() functions to so + that they'll ferry our underlying "ret" along, rather than normalizing + it to a "return 1". But as an intermediate step to that we'll need to + fix git_configset_get_value_multi() to return "int", and that change + itself is smaller because of this change to migrate some callers away + from the *_value_multi() API. + + 1. 3c8687a73ee (add `config_set` API for caching config-like files, 2014-07-28) + 2. https://lore.kernel.org/git/xmqqczadkq9f.fsf@gitster.g/ + 3. 1e8697b5c4e (submodule--helper: check repo{_submodule,}_init() + return values, 2022-09-01), Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> @@ config.c: void read_very_early_config(config_fn_t cb, void *data) } -static struct config_set_element *configset_find_element(struct config_set *cs, const char *key) ++RESULT_MUST_BE_USED +static int configset_find_element(struct config_set *cs, const char *key, + struct config_set_element **dest) { @@ config.c: void git_config_clear(void) ## config.h ## @@ config.h: void git_configset_clear(struct config_set *cs); - - /* - * These functions return 1 if not found, and 0 if found, leaving the found -- * value in the 'dest' pointer. -+ * value in the 'dest' pointer (if any). + * value in the 'dest' pointer. */ +RESULT_MUST_BE_USED @@ config.h: void git_protected_config(config_fn_t fn, void *data); + * documentation. */ ++RESULT_MUST_BE_USED +int git_config_get(const char *key); + /** * Finds the highest-priority value for the configuration variable `key`, * stores the pointer to it in `value` and returns 0. When the + + ## t/helper/test-config.c ## +@@ + * get_value_multi -> prints all values for the entered key in increasing order + * of priority + * ++ * get -> print return value for the entered key ++ * + * get_int -> print integer value for the entered key or die + * + * get_bool -> print bool value for the entered key or die +@@ t/helper/test-config.c: int cmd__config(int argc, const char **argv) + printf("Value not found for \"%s\"\n", argv[2]); + goto exit1; + } ++ } else if (argc == 3 && !strcmp(argv[1], "get")) { ++ int ret; ++ ++ if (!(ret = git_config_get(argv[2]))) ++ goto exit0; ++ else if (ret == 1) ++ printf("Value not found for \"%s\"\n", argv[2]); ++ else if (ret == -CONFIG_INVALID_KEY) ++ printf("Key \"%s\" is invalid\n", argv[2]); ++ else if (ret == -CONFIG_NO_SECTION_OR_NAME) ++ printf("Key \"%s\" has no section\n", argv[2]); ++ else ++ /* ++ * A normal caller should just check "ret < ++ * 0", but for our own tests let's BUG() if ++ * our whitelist of git_config_parse_key() ++ * return values isn't exhaustive. ++ */ ++ BUG("Key \"%s\" has unknown return %d", argv[2], ret); ++ goto exit1; + } else if (argc == 3 && !strcmp(argv[1], "get_int")) { + if (!git_config_get_int(argv[2], &val)) { + printf("%d\n", val); + + ## t/t1308-config-set.sh ## +@@ t/t1308-config-set.sh: test_expect_success 'setup default config' ' + skin = false + nose = 1 + horns ++ [value] ++ less + EOF + ' + +@@ t/t1308-config-set.sh: test_expect_success 'find value with the highest priority' ' + check_config get_value case.baz "hask" + ' + ++test_expect_success 'return value for an existing key' ' ++ test-tool config get lamb.chop >out 2>err && ++ test_must_be_empty out && ++ test_must_be_empty err ++' ++ ++test_expect_success 'return value for value-less key' ' ++ test-tool config get value.less >out 2>err && ++ test_must_be_empty out && ++ test_must_be_empty err ++' ++ ++test_expect_success 'return value for a missing key' ' ++ cat >expect <<-\EOF && ++ Value not found for "missing.key" ++ EOF ++ test_expect_code 1 test-tool config get missing.key >actual 2>err && ++ test_cmp actual expect && ++ test_must_be_empty err ++' ++ ++test_expect_success 'return value for a bad key: CONFIG_INVALID_KEY' ' ++ cat >expect <<-\EOF && ++ Key "fails.iskeychar.-" is invalid ++ EOF ++ test_expect_code 1 test-tool config get fails.iskeychar.- >actual 2>err && ++ test_cmp actual expect && ++ test_must_be_empty out ++' ++ ++test_expect_success 'return value for a bad key: CONFIG_NO_SECTION_OR_NAME' ' ++ cat >expect <<-\EOF && ++ Key "keynosection" has no section ++ EOF ++ test_expect_code 1 test-tool config get keynosection >actual 2>err && ++ test_cmp actual expect && ++ test_must_be_empty out ++' ++ + test_expect_success 'find integer value for a key' ' + check_config get_int lamb.chop 65 + ' +@@ t/t1308-config-set.sh: test_expect_success 'proper error on error in default config files' ' + cp .git/config .git/config.old && + test_when_finished "mv .git/config.old .git/config" && + echo "[" >>.git/config && +- echo "fatal: bad config line 34 in file .git/config" >expect && ++ echo "fatal: bad config line 36 in file .git/config" >expect && + test_expect_code 128 test-tool config get_value foo.bar 2>actual && + test_cmp expect actual + ' 4: aae1d5c12a9 ! 4: 382a77ca69e versioncmp.c: refactor config reading next commit @@ Commit message the bounds of our CodingGuidelines when it comes to line length, and to avoid repeating ourselves. + Renaming "deprecated_prereleases" to "oldl" doesn't help us to avoid + line wrapping now, but it will in a subsequent commit. + Let's also split out the names of the config variables into variables - of our own, so we don't have to repeat ourselves, and refactor the - nested if/else to avoid indenting it, and the existing bracing style - issue. + of our own, and refactor the nested if/else to avoid indenting it, and + the existing bracing style issue. This all helps with the subsequent commit, where we'll need to start checking different git_config_get_value_multi() return value. See 5: 23449ff2c4e ! 5: 8f17bf8150c config API: have *_multi() return an "int" and take a "dest" @@ Commit message return an "int" and populate a "**dest" parameter like every other git_configset_get_*()" in the API. - As we'll see in in subsequent commits this fixes a blind spot in the - API where it wasn't possible to tell whether a list was empty from - whether a config key existed. We'll take advantage of that in - subsequent commits, but for now we're faithfully converting existing - API callers. - - A logical follow-up to this would be to change the various "*_get_*()" - functions to ferry the git_configset_get_value() return value to their - own callers, e.g. git_configset_get_int() returns "1" rather than - ferrying up the "-1" that "git_configset_get_value()" might return, - but that's not being done in this series + As we'll take advantage of in subsequent commits, this fixes a blind + spot in the API where it wasn't possible to tell whether a list was + empty from whether a config key existed. For now we don't make use of + those new return values, but faithfully convert existing API users. Most of this is straightforward, commentary on cases that stand out: -: ----------- > 6: b515ff13f9b config API: don't lose the git_*get*() return values 6: 17c1218e74c = 7: 8a83c30ea78 for-each-repo: error on bad --config 7: 7fc91eaf747 = 8: d9abc78c2be config API users: test for *_get_value_multi() segfaults 8: a391ee17617 ! 9: 65fa91e7ce7 config API: add "string" version of *_value_multi(), fix segfaults @@ Commit message - 92156291ca8 (log: add default decoration filter, 2022-08-05) - 50a044f1e40 (gc: replace config subprocesses with API calls, 2022-09-27) - There are now three remaining files using the low-level API: - - - Two cases in "builtin/submodule--helper.c", where it's used safely - to see if any config exists. - - We could refactor these away from "multi" to some "does it exist?" - function, as [4] did, but as that's orthogonal to the "string" - safety we're introducing here let's leave them for now. + There are now two users ofthe low-level API: - One in "builtin/for-each-repo.c", which we'll convert in a subsequent commit. - - The "t/helper/test-config.c" code added in [4]. + - The "t/helper/test-config.c" code added in [3]. As seen in the preceding commit we need to give the "t/helper/test-config.c" caller these "NULL" entries. @@ Commit message 2008-02-11) 2. 6c47d0e8f39 (config.c: guard config parser from value=NULL, 2008-02-11). - 3. https://lore.kernel.org/git/patch-07.10-c01f7d85c94-20221026T151328Z-avarab@xxxxxxxxx/ - 4. 4c715ebb96a (test-config: add tests for the config_set API, + 3. 4c715ebb96a (test-config: add tests for the config_set API, 2014-07-28) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> 9: c7a5f5b4133 = 10: 4db3c6d0ed9 for-each-repo: with bad config, don't conflate <path> and <cmd> -- 2.39.1.1430.gb2471c0aaf4