SZEDER Gábor <szeder.dev@xxxxxxxxx> writes: > I'd prefer to leave the test cases unchanged, and make the revision > walking machinery look at Bloom filters even for root commits, because > this is an annoying and recurring testing issue. I remember it > annoyed me back then, when I wanted to reproduce a couple of bugs that > I knew were there, but my initial test cases didn't fail because the > Bloom filter of the root commit was ignored; Derrick overlooked this > in b16a8277644, you overlooked it now, and none of the reviewers then > and now caught it, either. I agree that making the revwalk look at Bloom filters of root commits is an urgent matter for the reasons you describe (it's something easily missed that slows down future development and/or makes future development error-prone), but so is moving to a correct murmur3 implementation, I think, and one shouldn't stop the other. There could be an argument that because the revwalk doesn't look at root commit Bloom filters, any development on a new Bloom filter hash version is suspect, but I don't think that has to be completely true. > > My original design (up to patch 7 in this patch set) defends against > > this by taking the very first version detected and rejecting every > > other version, and Taylor's subsequent design reads every version, but > > annotates filters with its version. So I think we're covered. > > In the meantime I adapted the test cases from the above linked message > to write commit-graph layers with different Bloom filter versions, and > it does fail, because commits from the bottom commit-graph layer are > omitted from the revision walk's output. And the test case doesn't > even need a middle layer without modified path Bloom filters to "hide" > the different version in the bottom layer. Merging the layers seems > to work, though. For what it's worth, your test case below (with test_expect_success instead of test_expect_failure) passes with my original design. With the full patch set, it does fail, but for what it's worth, I did spot this, providing an incomplete solution [1] and then a complete one [2]. Your test case passes if I also include the following: diff --git a/bloom.c b/bloom.c index ff131893cd..1bafd62a4e 100644 --- a/bloom.c +++ b/bloom.c @@ -344,6 +344,10 @@ struct bloom_filter *get_bloom_filter(struct repository *r, struct commit *c) prepare_repo_settings(r); hash_version = r->settings.commit_graph_changed_paths_version; + if (hash_version == -1) { + struct bloom_filter_settings *s = get_bloom_filter_settings(r); + hash_version = (s && s->hash_version == 2) ? 2 : 1; + } if (!(hash_version == -1 || hash_version == filter->version)) return NULL; /* unusable filter */ [1] https://lore.kernel.org/git/20230824222051.2320003-1-jonathantanmy@xxxxxxxxxx/ [2] https://lore.kernel.org/git/20230829220432.558674-1-jonathantanmy@xxxxxxxxxx/ > Besides fixing this issue, I think that the interaction of different > Bloom filter versions and split commit-graphs needs to be thoroughly > covered with test cases and discussed in the commit messages before > this series could be considered good for merging. Regarding commit messages, I can see that in "commit-graph: new filter ver. that fixes murmur3", I could add "(the first version read if there are multiple versions of changed path filters in the repository)" after "automatically determined from the version of the existing changed path filters in the repository". Taylor's patches already inherently cover multiple versions, I think, since Bloom filters are annotated with their versions, individually. Regarding tests, yes, we could add the one you provided. If you (or anyone else) can spot anything else that should be added, please let us know.