Taylor Blau <me@xxxxxxxxxxxx> writes: > On Thu, Aug 24, 2023 at 04:05:27PM -0700, Jonathan Tan wrote: > > Up to here is fine. > > > > > + hash_version = r->settings.commit_graph_changed_paths_version; > > > > Instead of doing this, do this (well, move the struct declaration to > > the top): > > > > struct bloom_filter_settings *s = get_bloom_filter_settings(r); > > hash_version = s->hash_version == 2 ? 2 : 1; > > Do we need this normalization? We assign s->hash_version in > commit-graph.c::graph_read_bloom_data() by reading it from the start of > the BDAT chunk, so this should only ever be 1 or 2. I'm not sure offhand if we do...I wrote it this way to match fill_bloom_key(), but fill_bloom_key() was written in that way because it was the clearest, not specifically because it needed to normalize. > > > + if (!(hash_version == -1 || hash_version == filter->version)) > > > > No need for the comparison to -1 here. > > I'm not sure I understand your suggestion. When we fetch the filter from > get_or_compute_bloom_filter(), we have filter->version set to the > hash_version from the containing graph's Bloom filter settings. > > So (besides the normalization), I would think that: > > struct bloom_filter_settings *s = get_bloom_filter_settings(r); > struct bloom_filter *f = get_bloom_filter(...); > > assert(s->hash_version == f->version); > > would hold. My mention to avoid the comparison to -1 was just for completeness - since we're normalizing the value of hash_version to 1 or 2, we no longer need to compare it to -1. As for whether s->hash_version is always equal to f->version, I think that it may not be true if for some reason, there are multiple commit graph files on disk, not all with the same Bloom filter version. > I think the check that we want to make is instead: is this Bloom > filter's version (or equivalently, the hash version indicated by that > graph's BDAT chunk) something that we can read? I think it's not "something that we can read" (eventually, we can read all versions, we just treat them differently) but "the version that fill_bloom_key" will use. We don't want this function to produce a Bloom filter of version X and then have the calling code subsequently use it with a Bloom key of version Y. > And I think "what we can > read" here is dictated by the commit_graph_changed_paths_version member > of our repository_settings, no? I don't think commit_graph_changed_paths_version always dictates something - it could be -1 (which you have probably seen, since you check it against -1 in the current version of the patch). One of the points of my suggestion is to avoid this field completely.