Hi, Here are some patches that I have been sitting on while rolling out changed-path Bloom filters at GitHub. I was initially going to send this as four separate topics, but they really make the most sense when sent all together. Here's an overview of what's contained: - The first patch fixes a bug where Bloom filters are only read if the most-recent layer in a commit-graph-chain has Bloom filters. - Patches 2-5 introduce 'commitgraph.readChangedPaths', a new configuration option for debugging changed-path Bloom filters. When disabled, the commit-graph machinery pretends as if there are no BIDX/BDAT chunks. This is useful when testing behavior with/without Bloom filters without having to regenerate the commit graph. - Patches 6-7 introduce a new 'BFXL' chunk which is a bitmap indicating which Bloom filters were too large to compute (ie., had more than 512 changed paths). This is a prerequisite for the final feature, --max-changed-paths. - Patches 8-10 introduces '--max-new-filters <n>', which allows callers to limit the number of Bloom filters that they are willing to compute from scratch when generating commit-graphs with --changed-paths. The BFXL chunk is a prerequisite for '--max-new-filters' because of an unfortunate overloading where filters with (1) no changed paths, (2) too many changed paths, and (3) ones that weren't computed at all are all represented as a length-0 filter by the BIDX chunk. The problem is that in repositories with many commits that have too many changed-paths to store Bloom filters, specifying '--max-new-filters' recomputes those same large filters each commit-graph write, only to throw them away. Knowing which filters are too-large allows us to skip over computing filters we know are a waste. Thanks in advance for your review! Taylor Blau (10): commit-graph: introduce 'get_bloom_filter_settings()' commit-graph: pass a 'struct repository *' in more places t4216: use an '&&'-chain t/helper/test-read-graph.c: prepare repo settings commit-graph: respect 'commitgraph.readChangedPaths' commit-graph.c: sort index into commits list commit-graph: add large-filters bitmap chunk bloom: split 'get_bloom_filter()' in two commit-graph: rename 'split_commit_graph_opts' builtin/commit-graph.c: introduce '--max-new-filters=<n>' Documentation/config.txt | 2 + Documentation/config/commitgraph.txt | 8 + Documentation/git-commit-graph.txt | 4 + .../technical/commit-graph-format.txt | 9 + blame.c | 8 +- bloom.c | 34 ++- bloom.h | 9 +- builtin/commit-graph.c | 61 +++-- commit-graph.c | 214 +++++++++++++----- commit-graph.h | 17 +- fuzz-commit-graph.c | 5 +- line-log.c | 2 +- repo-settings.c | 3 + repository.h | 1 + revision.c | 4 +- t/helper/test-bloom.c | 2 +- t/helper/test-read-graph.c | 3 +- t/t4216-log-bloom.sh | 54 ++++- t/t5324-split-commit-graph.sh | 13 ++ 19 files changed, 357 insertions(+), 96 deletions(-) create mode 100644 Documentation/config/commitgraph.txt -- 2.28.0.rc1.13.ge78abce653