On Fri, Feb 18 2022, Jonathan Tan wrote:

> e77aa336f1 ("ls-files: optionally recurse into submodules", 2016-10-10)
> taught ls-files the --recurse-submodules argument, but only in a limited
> set of circumstances. In particular, --stage was unsupported, perhaps
> because there was no repo_find_unique_abbrev(), which was only
> introduced in 8bb95572b0 ("sha1-name.c: add
> repo_find_unique_abbrev_r()", 2019-04-16). This function is needed for
> using --recurse-submodules with --stage.
>
> Now that we have repo_find_unique_abbrev(), teach support for this
> combination of arguments.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@xxxxxxxxxx>
> ---
> I got the similar-hashing object contents from Ævar's work in [1].

Hah! FWIW that was made by this script I hacked up at the time:

	#!/usr/bin/env perl
	use v5.32.0;
	use strict;
	use warnings;
	use Digest::SHA qw(sha1_hex sha256_hex);

	# Usage:
	## prefix= type=bad git find-colliding-hashes | tee garbage-coll-bad.txt
	## prefix= type=bad want=bad git find-colliding-hashes | tee garbage-coll-bad.txt

	$| = 1;
	my $s = $ENV{s} // "s";
	my %seen;
	my $type = $ENV{type} // "blob";
	my $prefix = $ENV{prefix} // "";
	my $want = $ENV{want} // "";
	while ($s++) {
		my $str = $prefix . $s;
		my $l = length($str) + 1;
		my $p = "$type $l\0$str\n";
		my $o = sha1_hex($p);
		next if length $want && index($o, $want) != 0;
		my $n = sha256_hex($p);
		my $os = substr($o, 0, 4);
		my $ns = substr($n, 0, 4);
		if ($os eq $ns) {
			say "hash($str) = [$os, $ns]" . ($seen{$os} ? " SEEN" : "");
			$seen{$os} = 1;
		}
	}

https://gist.github.com/avar/9e4c2bde7fbdc888b031713065a9eaf6 has some
more colliding blob prefixes, which I generated until I got bored with
it...

> +test_expect_success '--stage' '
> +	# In order to test hash abbreviation, write two objects that have the
> +	# same first 4 hexadecimal characters in their (SHA-1) hashes.
> +	echo brocdnra >submodule/c &&
> +	git -C submodule commit -am "update c" &&
> +	echo brigddsv >submodule/c &&
> +	git -C submodule commit -am "update c again" &&
> +
> +	cat >expect <<-\EOF &&
> +	100644 6da7 0	.gitmodules
> +	100644 7898 0	a
> +	100644 6178 0	b/b
> +	100644 dead9 0	submodule/c
> +	EOF

This test though will break, as you can see with:

	GIT_TEST_DEFAULT_HASH=sha256 ./t3007-ls-files-recurse-submodules.sh

So you'll need at least something like:

diff --git a/t/t3007-ls-files-recurse-submodules.sh b/t/t3007-ls-files-recurse-submodules.sh
index 3d2da360d17..0fe69da8dcf 100755
--- a/t/t3007-ls-files-recurse-submodules.sh
+++ b/t/t3007-ls-files-recurse-submodules.sh
@@ -42,10 +42,10 @@ test_expect_success '--stage' '
 	echo brigddsv >submodule/c &&
 	git -C submodule commit -am "update c again" &&
 
-	cat >expect <<-\EOF &&
-	100644 6da7 0	.gitmodules
-	100644 7898 0	a
-	100644 6178 0	b/b
+	cat >expect <<-EOF &&
+	100644 $(git rev-parse --short=4 HEAD:.gitmodules) 0	.gitmodules
+	100644 $(git rev-parse --short=4 HEAD:a) 0	a
+	100644 $(git rev-parse --short=4 HEAD:b/b) 0	b/b
 	100644 dead9 0	submodule/c
 	EOF

But then the problem is that one is dead9 and the other dead6; I was
just trying to find 4-char prefixes.

But having indulged in all that, I'm now entirely confused about why any
of this needs to be tested here.

You're adding --stage, which will give us --stage-y output, and it was
previously incompatible with --recurse-submodules. Having the two
combine is good!

But why do we need to test the OID abbreviation at all, isn't that a bit
too much paranoia? Isn't it sufficient to just do:

	opts="--stage --abbrev=4" &&
	git -C submodule ls-files $opts >expect &&
	git ls-files --recurse-submodules $opts >raw &&
	grep submodule raw >actual &&
	test_cmp expect actual

Or well, then the path won't be the same, but I think you get the idea. I.e.
don't we just want to test that the submodule is indeed included here,
not that some particular feature works in combination with it?

Supposing that repo_find_unique_abbrev() won't work might be a bit too
much paranoia, and I'm more test-happy than most :)

I'd think that if we should test anything it would be more meaningful to
e.g. test the sort order of the returned entries. Your test case won't
disambiguate between index entries being returned in sort order vs. just
"submodules at the end", since "s" sorts after 0, a and b.

Presumably it does the former, but I'd think distinguishing those would
be one meaningful test of actual --recurse-submodules --stage
functionality.
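
As an aside, here is a rough shell sketch of what the Perl script above
feeds to the hash functions, in case it helps when checking candidate
strings by hand under both hash algorithms. It assumes sha1sum and
sha256sum from coreutils are on the PATH, and the helper name blob_oid is
made up for illustration; it is not part of the patch or of git's test
library.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): print the object ID that a blob
# containing "$2" plus a trailing newline would get, computed with the
# checksum tool named in "$1" (assumed: sha1sum or sha256sum).
blob_oid () {
	tool=$1 content=$2
	# Size of the content plus the trailing newline. ${#content} counts
	# characters, so this is only right for ASCII input.
	len=$((${#content} + 1))
	# Git hashes the header "<type> <size>\0" followed by the content.
	printf 'blob %d\0%s\n' "$len" "$content" | "$tool" | cut -d' ' -f1
}

# The two strings written to submodule/c in the quoted test.
for c in brocdnra brigddsv
do
	echo "$c sha1=$(blob_oid sha1sum "$c") sha256=$(blob_oid sha256sum "$c")"
done
```

Where git is available, the SHA-1 column should match what
"echo <string> | git hash-object --stdin" prints in a SHA-1 repository.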