On Sat, Jan 13, 2024 at 07:35:44PM +0100, SZEDER Gábor wrote: > On Wed, Jan 03, 2024 at 10:08:18AM -0800, Junio C Hamano wrote: > > Taylor Blau <me@xxxxxxxxxxxx> writes: > > > > >> * tb/path-filter-fix (2023-10-18) 17 commits > > >> - bloom: introduce `deinit_bloom_filters()` > > >> ... > > >> - t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()` > > >> > > >> The Bloom filter used for path limited history traversal was broken > > >> on systems whose "char" is unsigned; update the implementation and > > >> bump the format version to 2. > > >> > > >> Expecting a reroll. > > >> cf. <20231023202212.GA5470@xxxxxxxxxx> > > >> source: <cover.1697653929.git.me@xxxxxxxxxxxx> > > > > > > I was confused by this one, since I couldn't figure out which tests > > > Gábor was referring to here. I responded in [1], but haven't heard back > > > since the end of October. > > > ... > > > [1]: https://lore.kernel.org/git/ZUARCJ1MmqgXfS4i@nand.local/ > > I keep referring to the test in: > > https://public-inbox.org/git/20230830200218.GA5147@xxxxxxxxxx/ > > which, rather disappointingly, is still the only test out there > exercising the interaction between split commit graphs and different > modified path Bloom filter versions. Note that in that message I > mentioned that merging layers with differenet Bloom filter versions > seemed to work, but that, alas, is no longer the case, because it's > now broken in Taylor's recent iterations of the patch series. > > At the risk of sounding like a broken record: the interaction of split > commit graphs and different Bloom filter versions should be thoroughly > tested. On a related note, if current git (I tried current master and v2.43.0) encounters a commit graph layer containing v2 Bloom filters (created by current seen) while writing a new commit graph, then it segfaults dereferencing a NULL 'settings' pointer in get_or_compute_bloom_filter(). The test below demonstrates this, but it's quite hacky using two different git versions: it has to be run by an old git version not yet supporting v2 Bloom filters, and a new git version already supporting them should be installed at /tmp/git-new/. diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh index 2ba0324a69..0464dd68d5 100755 --- a/t/t4216-log-bloom.sh +++ b/t/t4216-log-bloom.sh @@ -454,4 +454,33 @@ test_expect_success 'Bloom reader notices out-of-order index offsets' ' test_cmp expect.err err ' +CENT=$(printf "\302\242") +test_expect_success 'split commit graph vs changed paths Bloom filter v2 vs old git' ' + git init split-v2-old && + ( + cd split-v2-old && + git commit --allow-empty -m "Bloom filters are written but still ignored for root commits :(" && + for i in 1 2 3 + do + echo $i >$CENT && + git add $CENT && + git commit -m "$i" || return 1 + done && + git log --oneline -- $CENT >expect && + + # Here we write a commit graph layer containing v2 changed + # path Bloom filters using a git binary built from current + # 'seen' branch. + git rev-parse HEAD^ | + /tmp/git-new/bin/git -c commitgraph.changedPathsVersion=2 \ + commit-graph write --stdin-commits --changed-paths --split && + + # This is current master, and segfaults. + git commit-graph write --reachable --changed-paths && + + git log --oneline -- $CENT >actual && + test_cmp expect actual + ) +' + test_done (gdb) r Starting program: /home/szeder/src/git/git commit-graph write --reachable --changed-paths [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". Program received signal SIGSEGV, Segmentation fault. 0x00005555556b8714 in get_or_compute_bloom_filter (r=0x555555a0e5a0 <the_repo>, c=0x555555a1cdd0, compute_if_not_present=1, settings=0x0, computed=0x7fffffffd634) at bloom.c:253 253 diffopt.max_changes = settings->max_changed_paths; (gdb) l 248 return NULL; 249 250 repo_diff_setup(r, &diffopt); 251 diffopt.flags.recursive = 1; 252 diffopt.detect_rename = 0; 253 diffopt.max_changes = settings->max_changed_paths; 254 diff_setup_done(&diffopt); 255 256 /* ensure commit is parsed so we have parent information */ 257 repo_parse_commit(r, c); (gdb) bt #0 0x00005555556b8714 in get_or_compute_bloom_filter ( r=0x555555a0e5a0 <the_repo>, c=0x555555a1cdd0, compute_if_not_present=1, settings=0x0, computed=0x7fffffffd634) at bloom.c:253 #1 0x00005555556d16d5 in compute_bloom_filters (ctx=0x555555a1c200) at commit-graph.c:1783 #2 0x00005555556d4329 in write_commit_graph (odb=0x555555a19210, pack_indexes=0x0, commits=0x7fffffffd7c0, flags=(COMMIT_GRAPH_WRITE_PROGRESS | COMMIT_GRAPH_WRITE_BLOOM_FILTERS), opts=0x5555559ef8d0 <write_opts>) at commit-graph.c:2603 #3 0x00005555556d19ab in write_commit_graph_reachable (odb=0x555555a19210, flags=(COMMIT_GRAPH_WRITE_PROGRESS | COMMIT_GRAPH_WRITE_BLOOM_FILTERS), opts=0x5555559ef8d0 <write_opts>) at commit-graph.c:1849 #4 0x00005555555a63c2 in graph_write (argc=0, argv=0x7fffffffde20, prefix=0x0) at builtin/commit-graph.c:294 #5 0x00005555555a66f4 in cmd_commit_graph (argc=3, argv=0x7fffffffde20, prefix=0x0) at builtin/commit-graph.c:353 #6 0x0000555555575b43 in run_builtin (p=0x5555559da260 <commands+576>, argc=4, argv=0x7fffffffde20) at git.c:469 #7 0x0000555555575fcb in handle_builtin (argc=4, argv=0x7fffffffde20) at git.c:724 #8 0x0000555555576272 in run_argv (argcp=0x7fffffffdc7c, argv=0x7fffffffdc70) at git.c:788 #9 0x000055555557685d in cmd_main (argc=4, argv=0x7fffffffde20) at git.c:923 #10 0x000055555568ad55 in main (argc=5, argv=0x7fffffffde18) at common-main.c:62