Hey!

The commit graph feature brought in a lot of performance improvements across
multiple commands. However, file based history continues to be a performance
pain point, especially in large repositories.

Adopting changed path bloom filters has been discussed on the list before, and
a prototype version was worked on by SZEDER Gábor, Jonathan Tan and Dr. Derrick
Stolee [1]. This series is based on Dr. Stolee's approach [2] and presents an
updated and more polished RFC version of the feature.

Performance Gains:
We tested the performance of git log -- path on the git repo, the linux repo
and some internal large repos, with a variety of paths of varying depths.

On the git and linux repos:
We observed a 2x to 5x speed up.

On a large internal repo with files seated 6-10 levels deep in the tree:
We observed 10x to 20x speed ups, with some paths going up to 28 times faster.

Future Work (not included in the scope of this series):
1. Supporting multiple path based revision walk
2. Adopting it in git blame logic.
3. Interactions with line log git log -L

This series is intended to start the conversation and many of the commit
messages include specific call outs for suggestions and thoughts.

Cheers!
Garima Singh

[1] https://lore.kernel.org/git/20181009193445.21908-1-szeder.dev@xxxxxxxxx/
[2] https://lore.kernel.org/git/61559c5b-546e-d61b-d2e1-68de692f5972@xxxxxxxxx/

Garima Singh (9):
  commit-graph: add --changed-paths option to write
  commit-graph: write changed paths bloom filters
  commit-graph: use MAX_NUM_CHUNKS
  commit-graph: document bloom filter format
  commit-graph: write changed path bloom filters to commit-graph file.
  commit-graph: test commit-graph write --changed-paths
  commit-graph: reuse existing bloom filters during write.
  revision.c: use bloom filters to speed up path based revision walks
  commit-graph: add GIT_TEST_COMMIT_GRAPH_BLOOM_FILTERS test flag

 Documentation/git-commit-graph.txt    |   5 +
 .../technical/commit-graph-format.txt |  17 ++
 Makefile                              |   1 +
 bloom.c                               | 257 +++++++++++++++++
 bloom.h                               |  51 ++++
 builtin/commit-graph.c                |   9 +-
 ci/run-build-and-tests.sh             |   1 +
 commit-graph.c                        | 116 +++++++-
 commit-graph.h                        |   9 +-
 revision.c                            |  67 ++++-
 revision.h                            |   5 +
 t/README                              |   3 +
 t/helper/test-read-graph.c            |   4 +
 t/t4216-log-bloom.sh                  |  77 ++++++
 t/t5318-commit-graph.sh               |   2 +
 t/t5324-split-commit-graph.sh         |   1 +
 t/t5325-commit-graph-bloom.sh         | 258 ++++++++++++++++++
 17 files changed, 875 insertions(+), 8 deletions(-)
 create mode 100644 bloom.c
 create mode 100644 bloom.h
 create mode 100755 t/t4216-log-bloom.sh
 create mode 100755 t/t5325-commit-graph-bloom.sh

base-commit: b02fd2accad4d48078671adf38fe5b5976d77304
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-497%2Fgarimasi514%2FcoreGit-bloomFilters-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-497/garimasi514/coreGit-bloomFilters-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/497
--
gitgitgadget