A change recently merged to 'next' stops us from defaulting to using SHA-1 unless other code (like a logic early in the start-up sequence to see what hash is being used in the repository we are working in) explicitly sets it, leading to a (deliberate) crash of "git" when we forgot to cover certain code paths. It turns out we have a few. Notable ones are all operations that are designed to work outside a repository. We should go over all such code paths and give them a reasonable default when there is one available (e.g. for historical reasons, patch-id is documented to work with SHA-1 hashes, so arguably it, or at least when it is invoked with the "--stable" option, should do so everywhere, not just in SHA-1 repositories, but in SHA-256 repository or outside any repository). In the meantime, if an end-user hits such a "bug" before we can fix it, it would be nice to give them an escape hatch to restore the historical behaviour of falling back to use SHA-1. These patches are designed to apply on a merge of c8aed5e8 (repository: stop setting SHA1 as the default object hash, 2024-05-07) into 3e4a232f (The third batch, 2024-05-13), which has been the same base throughout the past iterations. In this fifth iteration: - The first step no longer falls back to GIT_DEFAULT_HASH; the escape hatch is a dedicated GIT_TEST_DEFAULT_HASH_ALGO environment variable, but hopefully we do not have to advertise it all that often. - The second step has been simplified somewhat to use the "nongit" helper when we only need to run a single "git" command in t1517. The way the expected output files were prepared in the previous versions did not correctly force use of SHA-1 algorithm, which has been corrected. The third step and fourth step for t1517 continue to be "flip expect_failure to expect_success", but you can see context differences in the range-diff. - The fourth step also has a fix for t1007 where the previous iterations did not correctly force use of SHA-1 to prepare the expected output. Otherwise this round should be ready, modulo possible typoes. Junio C Hamano (3): setup: add an escape hatch for "no more default hash algorithm" change t1517: test commands that are designed to be run outside repository apply: fix uninitialized hash function Patrick Steinhardt (2): builtin/patch-id: fix uninitialized hash function builtin/hash-object: fix uninitialized hash function builtin/apply.c | 4 +++ builtin/hash-object.c | 3 +++ builtin/patch-id.c | 13 +++++++++ repository.c | 44 ++++++++++++++++++++++++++++++ t/t1007-hash-object.sh | 6 +++++ t/t1517-outside-repo.sh | 59 +++++++++++++++++++++++++++++++++++++++++ t/t4204-patch-id.sh | 34 ++++++++++++++++++++++++ 7 files changed, 163 insertions(+) create mode 100755 t/t1517-outside-repo.sh -- 2.45.1-216-g4365c6fcf9 1: 3038f73e1c ! 1: f5e6868a61 setup: add an escape hatch for "no more default hash algorithm" change @@ Commit message Partially revert c8aed5e8 (repository: stop setting SHA1 as the default object hash, 2024-05-07), to keep end-user systems still broken when we have gap in our test coverage but yet give them an - escape hatch to set the GIT_DEFAULT_HASH environment variable to - "sha1" in order to revert to the previous behaviour. This variable - has been in use for using SHA-256 hash by default, and it should be - a better fit than inventing a new and test-only knob. + escape hatch to set the GIT_TEST_DEFAULT_HASH_ALGO environment + variable to "sha1" in order to revert to the previous behaviour. - Signed-off-by: Junio C Hamano <gitster@xxxxxxxxx> + Due to the way the end-user facing GIT_DEFAULT_HASH environment + variable is used in our test suite, we unfortunately cannot reuse it + for this purpose. - ## environment.h ## -@@ environment.h: const char *getenv_safe(struct strvec *argv, const char *name); - #define GIT_OPTIONAL_LOCKS_ENVIRONMENT "GIT_OPTIONAL_LOCKS" - #define GIT_TEXT_DOMAIN_DIR_ENVIRONMENT "GIT_TEXTDOMAINDIR" - #define GIT_ATTR_SOURCE_ENVIRONMENT "GIT_ATTR_SOURCE" -+#define GIT_DEFAULT_HASH_ENVIRONMENT "GIT_DEFAULT_HASH" - - /* - * Environment variable used in handshaking the wire protocol. + Signed-off-by: Junio C Hamano <gitster@xxxxxxxxx> ## repository.c ## -@@ - #include "git-compat-util.h" - #include "abspath.h" -+#include "environment.h" - #include "repository.h" - #include "object-store-ll.h" - #include "config.h" @@ static struct repository the_repo; struct repository *the_repository = &the_repo; ++/* ++ * An escape hatch: if we hit a bug in the production code that fails ++ * to set an appropriate hash algorithm (most likely to happen when ++ * running outside a repository), we can tell the user who reported ++ * the crash to set the environment variable to "sha1" (all lowercase) ++ * to revert to the historical behaviour of defaulting to SHA-1. ++ */ +static void set_default_hash_algo(struct repository *repo) +{ + const char *hash_name; + int algo; + -+ hash_name = getenv(GIT_DEFAULT_HASH_ENVIRONMENT); ++ hash_name = getenv("GIT_TEST_DEFAULT_HASH_ALGO"); + if (!hash_name) + return; + algo = hash_algo_by_name(hash_name); -+ -+ /* -+ * NEEDSWORK: after all, falling back to SHA-1 by assigning -+ * GIT_HASH_SHA1 to algo here, instead of returning, may give -+ * us better behaviour. -+ */ + if (algo == GIT_HASH_UNKNOWN) + return; + @@ repository.c: void initialize_repository(struct repository *repo) + * of the hashed value to see if their input looks like a + * plausible hash value. + * -+ * We are in the process of identifying the codepaths and ++ * We are in the process of identifying such code paths and + * giving them an appropriate default individually; any -+ * unconverted codepath that tries to access the_hash_algo ++ * unconverted code paths that try to access the_hash_algo + * will thus fail. The end-users however have an escape hatch -+ * to set GIT_DEFAULT_HASH environment variable to "sha1" get -+ * back the old behaviour of defaulting to SHA-1. ++ * to set GIT_TEST_DEFAULT_HASH_ALGO environment variable to ++ * "sha1" to get back the old behaviour of defaulting to SHA-1. ++ * ++ * This escape hatch is deliberately kept unadvertised, so ++ * that they see crashes and we can get a report before ++ * telling them about it. + */ + if (repo == the_repository) + set_default_hash_algo(repo); } static void expand_base_dir(char **out, const char *in, - - ## setup.c ## -@@ setup.c: int daemonize(void) - #define TEST_FILEMODE 1 - #endif - --#define GIT_DEFAULT_HASH_ENVIRONMENT "GIT_DEFAULT_HASH" -- - static void copy_templates_1(struct strbuf *path, struct strbuf *template_path, - DIR *dir) - { 2: e56ae68961 ! 2: f3f9bf0480 t1517: test commands that are designed to be run outside repository @@ t/t1517-outside-repo.sh (new) + git diff >sample.patch +' + -+test_expect_failure 'compute a patch-id outside repository' ' -+ git patch-id <sample.patch >patch-id.expect && -+ ( -+ cd non-repo && -+ git patch-id <../sample.patch >../patch-id.actual -+ ) && ++test_expect_failure 'compute a patch-id outside repository (uses SHA-1)' ' ++ nongit env GIT_DEFAULT_HASH=sha1 \ ++ git patch-id <sample.patch >patch-id.expect && ++ nongit \ ++ git patch-id <sample.patch >patch-id.actual && + test_cmp patch-id.expect patch-id.actual +' + -+test_expect_failure 'hash-object outside repository' ' -+ git hash-object --stdin <sample.patch >hash.expect && -+ ( -+ cd non-repo && -+ git hash-object --stdin <../sample.patch >../hash.actual -+ ) && ++test_expect_failure 'hash-object outside repository (uses SHA-1)' ' ++ nongit env GIT_DEFAULT_HASH=sha1 \ ++ git hash-object --stdin <sample.patch >hash.expect && ++ nongit \ ++ git hash-object --stdin <sample.patch >hash.actual && + test_cmp hash.expect hash.actual +' + 3: e1ae0e95fc ! 3: 246cf395d0 builtin/patch-id: fix uninitialized hash function @@ t/t1517-outside-repo.sh: test_expect_success 'set up a non-repo directory and te git diff >sample.patch ' --test_expect_failure 'compute a patch-id outside repository' ' -+test_expect_success 'compute a patch-id outside repository' ' - git patch-id <sample.patch >patch-id.expect && - ( - cd non-repo && +-test_expect_failure 'compute a patch-id outside repository (uses SHA-1)' ' ++test_expect_success 'compute a patch-id outside repository (uses SHA-1)' ' + nongit env GIT_DEFAULT_HASH=sha1 \ + git patch-id <sample.patch >patch-id.expect && + nongit \ ## t/t4204-patch-id.sh ## @@ t/t4204-patch-id.sh: test_expect_success 'patch-id handles diffs with one line of before/after' ' 4: 1c8ce0d024 ! 4: fe14490548 builtin/hash-object: fix uninitialized hash function @@ t/t1007-hash-object.sh: test_expect_success '--literally with extra-long type' ' echo example | git hash-object -t $t --literally --stdin ' -+test_expect_success '--stdin outside of repository' ' ++test_expect_success '--stdin outside of repository (uses SHA-1)' ' + nongit git hash-object --stdin <hello >actual && -+ echo "$(test_oid hello)" >expect && ++ echo "$(test_oid --hash=sha1 hello)" >expect && + test_cmp expect actual +' + test_done ## t/t1517-outside-repo.sh ## -@@ t/t1517-outside-repo.sh: test_expect_success 'compute a patch-id outside repository' ' +@@ t/t1517-outside-repo.sh: test_expect_success 'compute a patch-id outside repository (uses SHA-1)' ' test_cmp patch-id.expect patch-id.actual ' --test_expect_failure 'hash-object outside repository' ' -+test_expect_success 'hash-object outside repository' ' - git hash-object --stdin <sample.patch >hash.expect && - ( - cd non-repo && +-test_expect_failure 'hash-object outside repository (uses SHA-1)' ' ++test_expect_success 'hash-object outside repository (uses SHA-1)' ' + nongit env GIT_DEFAULT_HASH=sha1 \ + git hash-object --stdin <sample.patch >hash.expect && + nongit \ 5: 5b0ec43757 ! 5: c6d5e68374 apply: fix uninitialized hash function @@ builtin/apply.c: int cmd_apply(int argc, const char **argv, const char *prefix) apply_usage); ## t/t1517-outside-repo.sh ## -@@ t/t1517-outside-repo.sh: test_expect_success 'hash-object outside repository' ' +@@ t/t1517-outside-repo.sh: test_expect_success 'hash-object outside repository (uses SHA-1)' ' test_cmp hash.expect hash.actual '