If the changed-path Bloom filters are relatively stable, then I propose trying to build upon them as a way to discover any deficiencies. Also, it's good to use them when we can. The goal I set out to achieve was to use Bloom filters as much as possible in git blame. I think I have achieved most of that, except I did not integrate it with the -C mode. In that case, the blob-diff computation takes a majority of the computation time, so short-circuiting the tree diff using Bloom filters. Also, it's just really complicated. If someone else thinks there is an easy win there, then please go ahead and give it a try with the extra logic presented here in PATCH 3. While I was playing around with Bloom filters and git blame, I purposefully got it working with some scenarios but not all. Then I tried to trigger a failing build in the blame tests using GIT_TEST_COMMIT_GRAPH=1 and GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1. But the tests all succeeded! Examining the data, I see that the commit-graph didn't have the Bloom filter chunks at all. This is because we are not setting the flag to write them in the right spot. The GIT_TEST_COMMIT_GRAPH=1 variable triggers a commit-graph write during git commit, so we need to update the code there instead of just inspecting the variable in git commit-graph write. (This is PATCH 2.) By updating this variable, I saw some test failures in other tests regarding non-exact pathspecs. I fixed these in PATCH 1 so we keep clean builds. I based this change on [1] but it would apply cleanly (and logically) on gs/commit-graph-path-filter Updates in v2: * Added PATCH 3 to write commit-graph files during 'git merge' if GIT_TEST_COMMIT_GRAPH is enabled. * Updated PATCH 1 with the simplification recommended by Taylor. * Fixed the lower-case "bloom" in the commit message. Thanks, -Stolee [1] https://lore.kernel.org/git/pull.601.v2.git.1586437211842.gitgitgadget@xxxxxxxxx/ Derrick Stolee (4): revision: complicated pathspecs disable filters commit: write commit-graph with Bloom filters commit-graph: write commit-graph in more tests blame: use changed-path Bloom filters blame.c | 139 ++++++++++++++++++++++++++++++++++++++++++++--- blame.h | 6 ++ builtin/blame.c | 10 ++++ builtin/commit.c | 4 +- builtin/merge.c | 7 ++- commit-graph.c | 9 +++ commit-graph.h | 2 + revision.c | 19 ++++++- 8 files changed, 181 insertions(+), 15 deletions(-) base-commit: f4df00a0dd448edce0e854a97f63598fefe27d27 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-609%2Fderrickstolee%2Fbloom-blame-v2 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-609/derrickstolee/bloom-blame-v2 Pull-Request: https://github.com/gitgitgadget/git/pull/609 Range-diff vs v1: 1: 9cc31c289aa ! 1: adc03eee4ac revision: complicated pathspecs disable filters @@ Commit message path Bloom filters in the test suite. That will be fixed in the next change. + Helped-by: Taylor Blau <me@xxxxxxxxxxxx> Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx> ## revision.c ## +@@ revision.c: static void trace2_bloom_filter_statistics_atexit(void) + jw_release(&jw); + } + ++static int forbid_bloom_filters(struct pathspec *spec) ++{ ++ if (spec->has_wildcard) ++ return 1; ++ if (spec->nr > 1) ++ return 1; ++ if (spec->magic & ~PATHSPEC_LITERAL) ++ return 1; ++ if (spec->nr && (spec->items[0].magic & ~PATHSPEC_LITERAL)) ++ return 1; ++ ++ return 0; ++} ++ + static void prepare_to_use_bloom_filter(struct rev_info *revs) + { + struct pathspec_item *pi; @@ revision.c: static void prepare_to_use_bloom_filter(struct rev_info *revs) - if (!revs->commits) - return; + int len; -+ if (revs->prune_data.has_wildcard) -+ return; -+ if (revs->prune_data.nr > 1) -+ return; -+ if (revs->prune_data.magic || -+ (revs->prune_data.nr && -+ revs->prune_data.items[0].magic)) + if (!revs->commits) +- return; + return; + ++ if (forbid_bloom_filters(&revs->prune_data)) ++ return; + repo_parse_commit(revs->repo, revs->commits->item); - if (!revs->repo->objects->commit_graph) 2: bb5ce39d028 < -: ----------- commit: write commit-graph with bloom filters -: ----------- > 2: 7e8f1aed113 commit: write commit-graph with Bloom filters -: ----------- > 3: 824f8ad067b commit-graph: write commit-graph in more tests 3: 431fde68031 = 4: 4ae196d6355 blame: use changed-path Bloom filters -- gitgitgadget