Taylor Blau <me@xxxxxxxxxxxx> writes: > Sorry for not getting to this sooner. I didn't notice anything during my > review, but I think there may be a bug here. > > Suppose that we have an existing commit-graph with v1 Bloom filters. If > we then try to rewrite that commit-graph using v2 Bloom filters, we > *should* attempt to recompute the filter from scratch. But AFAICT, that > isn't what happens. > ... > If I'm parsing this all correctly, Git used the v1 filter corresponding > to that commit, and did not recompute a new one. > > I think that this could lead to incorrect results if you use this to > masquerade a v1 Bloom filter as a v2 one. That indeed is very bad. How hard it would be to construct a test case that fails if filter computed as v1 is marketed as v2? A test may be far easier to construct if it does not have to be end-to-end (e.g. instrument the codepath you followed through with the debugger with trace2 annotations and see the control takes the right branches by reading the log). > So I think we want to be more careful about when we load the existing > Bloom data or not. I think we probably want to load it in all cases, but > only read it when we have compatible Bloom settings. That should stop us > from exhibiting this kind of bug, but it also gives us a handle on > existing Bloom data if we wanted to copy forward existing results where > we can. > > If all of this tracks, I think that there is a gap in our test coverage > that didn't catch this earlier. Yeah, thanks for raising a concern. Jonathan?