Re: What's cooking in git.git (Jan 2024, #01; Tue, 2)

Taylor Blau <me@xxxxxxxxxxxx> · Tue, 16 Jan 2024 18:31:19 -0500

On Tue, Jan 16, 2024 at 02:45:41PM -0800, Junio C Hamano wrote:
> Taylor Blau <me@xxxxxxxxxxxx> writes:
>
> > OK, everything seems fine thus far, until we inspect the value of
> > g->bloom_filter_settings, which is NULL, becuase of this hunk from
> > commit-graph.c::graph_read_bloom_data():
> >
> >     if (hash_version != 1)
> >       return 0;
> >
> > which terminates the function before we assign g->bloom_filter_settings
> > for the existing (written with v2 Bloom filters) graph layer.
> >
> > I don't think that there is a way to fix this in a backwards compatible
> > way, but I'm comfortable with that in this instance since we don't
> > expect users to upgrading to v2 Bloom filters and then writing new graph
> > layers using a non-v2 compatible version of Git.
>
> A big red button solution to avoid this would be to uprev the
> repository format version once you start writing v2 Bloom filters
> anywhere in the layers.  That way, existing Git clients would not be
> able to touch it.  I do not know if the cure is more severe than the
> disease in that case, though.

I tend to think that in this case the cure is probably worse than the
disease. I expect it to be extremely rare that a user would upgrade to a
modern version of Git, write commit-graphs, then downgrade, and try to
write more commit-graphs.

> In any case, at least, we should be able to prepare the code that we
> teach to grok v2 today so that they do not trigger the same segfault
> when they see a commit graph layer containing v3 Bloom filters (or
> later).  Then we won't have to have the same conversation when we
> somehow need to update Bloom filters again.

This series should accomplish that by loading the Bloom chunk
unconditionally, and only reading its filters when they match the
given hash_version.

Thanks,
Taylor