This is following the discussion about adding a new changed path filter version due to the current implementation of murmur3 not matching the algorithm. [1] SZEDER Gábor suggested [2] that we change the revision walk to read changed path filters also for root commits, but I don't think that's possible - we have to tie reading changed path filters to when we read trees, and right now, we don't seem to read trees when evaluating root commits (rev_compare_tree() in revision.c is in the only code path that uses changed path filters, and it itself is only called per-parent and thus not called for root commits). The alternative is to not generate changed path filters for root commits (or what I did in this patch, which is to generate an all-1 filter), which seems reasonable to me. Is this a good idea? If yes, there is the follow-up question of how to report it in traces. I don't know off-hand if it's better to reuse the "large" statistic, since what we output is the same (all 1s), or if we should make a new one. Presumably what we need to weigh is the clarity versus the migration costs, but I don't know how to weigh it. If people are generally in agreement, I can send an updated patch that does not goto into an "else" block :) and also with updated tests (t0095 needs to check a non-root commit, and t4216 needs to be updated with whatever statistics we decide to use). [1] https://lore.kernel.org/git/20230826150610.GA1928@xxxxxxxxxx/ [2] https://lore.kernel.org/git/20230830200218.GA5147@xxxxxxxxxx/ --- bloom.c | 3 ++- t/t0095-bloom.sh | 2 +- t/t4216-log-bloom.sh | 12 +++--------- 3 files changed, 6 insertions(+), 11 deletions(-) diff --git a/bloom.c b/bloom.c index aef6b5fea2..b21130b236 100644 --- a/bloom.c +++ b/bloom.c @@ -226,7 +226,7 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r, if (c->parents) diff_tree_oid(&c->parents->item->object.oid, &c->object.oid, "", &diffopt); else - diff_tree_oid(NULL, &c->object.oid, "", &diffopt); + goto large; diffcore_std(&diffopt); if (diff_queued_diff.nr <= settings->max_changed_paths) { @@ -292,6 +292,7 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r, } else { for (i = 0; i < diff_queued_diff.nr; i++) diff_free_filepair(diff_queued_diff.queue[i]); +large: init_truncated_large_filter(filter); if (computed) diff --git a/t/t0095-bloom.sh b/t/t0095-bloom.sh index b567383eb8..02a0b41026 100755 --- a/t/t0095-bloom.sh +++ b/t/t0095-bloom.sh @@ -74,7 +74,7 @@ test_expect_success !SANITIZE_LEAK 'get bloom filters for commit with no changes git commit --allow-empty -m "c0" && cat >expect <<-\EOF && Filter_Length:1 - Filter_Data:00| + Filter_Data:ff| EOF test-tool bloom get_filter_for_commit "$(git rev-parse HEAD)" >actual && test_cmp expect actual diff --git a/t/t4216-log-bloom.sh b/t/t4216-log-bloom.sh index fa9d32facf..d14fe93fb1 100755 --- a/t/t4216-log-bloom.sh +++ b/t/t4216-log-bloom.sh @@ -283,7 +283,6 @@ test_expect_success 'correctly report changes over limit' ' git commit-graph write --reachable --changed-paths && test_max_changed_paths 11 trace-update && test_filter_computed 2 trace-update && - test_filter_trunc_large 0 trace-update && for path in $(git ls-tree -r --name-only HEAD) do @@ -306,9 +305,7 @@ test_expect_success 'correctly report commits with no changed paths' ' GIT_TRACE2_EVENT="$(pwd)/trace.event" \ git commit-graph write --reachable --changed-paths && test_filter_computed 1 trace.event && - test_filter_not_computed 0 trace.event && - test_filter_trunc_empty 1 trace.event && - test_filter_trunc_large 0 trace.event + test_filter_not_computed 0 trace.event ) ' @@ -363,8 +360,7 @@ test_expect_success '--max-new-filters overrides configuration' ' --max-new-filters=1 && test_filter_computed 1 trace.event && test_filter_not_computed 1 trace.event && - test_filter_trunc_empty 0 trace.event && - test_filter_trunc_large 0 trace.event + test_filter_trunc_empty 0 trace.event ) ' @@ -386,9 +382,7 @@ test_expect_success 'Bloom generation backfills empty commits' ' git commit-graph write --reachable \ --changed-paths --max-new-filters=2 && test_filter_computed 2 trace.event && - test_filter_not_computed 4 trace.event && - test_filter_trunc_empty 2 trace.event && - test_filter_trunc_large 0 trace.event || return 1 + test_filter_not_computed 4 trace.event || return 1 done && # Finally, make sure that once all commits have filters, that -- 2.42.0.283.g2d96d420d3-goog