On Fri, Sep 01, 2023 at 01:56:15PM -0700, Jonathan Tan wrote: > > > My original design (up to patch 7 in this patch set) defends against > > > this by taking the very first version detected and rejecting every > > > other version, and Taylor's subsequent design reads every version, but > > > annotates filters with its version. So I think we're covered. > > > > In the meantime I adapted the test cases from the above linked message > > to write commit-graph layers with different Bloom filter versions, and > > it does fail, because commits from the bottom commit-graph layer are > > omitted from the revision walk's output. And the test case doesn't > > even need a middle layer without modified path Bloom filters to "hide" > > the different version in the bottom layer. Merging the layers seems > > to work, though. > > For what it's worth, your test case below (with test_expect_success > instead of test_expect_failure) passes with my original design. With the > full patch set, it does fail, but for what it's worth, I did spot this, > providing an incomplete solution [1] and then a complete one [2]. Your > test case passes if I also include the following: > > diff --git a/bloom.c b/bloom.c > index ff131893cd..1bafd62a4e 100644 > --- a/bloom.c > +++ b/bloom.c > @@ -344,6 +344,10 @@ struct bloom_filter *get_bloom_filter(struct repository *r, struct commit *c) > > prepare_repo_settings(r); > hash_version = r->settings.commit_graph_changed_paths_version; > + if (hash_version == -1) { > + struct bloom_filter_settings *s = get_bloom_filter_settings(r); > + hash_version = (s && s->hash_version == 2) ? 2 : 1; > + } > > if (!(hash_version == -1 || hash_version == filter->version)) > return NULL; /* unusable filter */ > > [1] https://lore.kernel.org/git/20230824222051.2320003-1-jonathantanmy@xxxxxxxxxx/ > [2] https://lore.kernel.org/git/20230829220432.558674-1-jonathantanmy@xxxxxxxxxx/ Hmm. I am confused -- are you saying that this series breaks existing functionality, or merely does not patch an existing breakage? I *think* that it's the latter, since this test case fails identically on master, but I am not sure. If my understanding is correct, I think that patching this would involve annotating each Bloom filter with a pointer to the bloom_filter_settings it was written with, and then using those settings when querying it. Thanks, Taylor