Taylor Blau <me@xxxxxxxxxxxx> writes: > This is a little beyond the scope of your series, but since we're > changing the on-disk format here a little bit, I think that it might be > worth it to consider whether there are any other changes that we'd like > to perform at the same time. > > One that comes to mind is serializing the `max_changed_paths` value of > the Bloom filter settings, which is currently hard-coded as a constant, > c.f. 97ffa4fab50 (commit-graph.c: store maximum changed paths, > 2020-09-17). > > We always assume that the value there is 512, or the environment > variable GIT_TEST_BLOOM_SETTINGS_MAX_CHANGED_PATHS, if it is set. But it > might be nice to write it to disk, since it would allow us to do > something like: That's true...I do think it makes sense to either have both bits_per_entry and max_changed_paths (because if we are reusing Bloom filters when writing, presumably we would want to make sure that the existing ones were generated using the same settings), or have neither (since we don't need them for reading). Having said that, I am inclined to not change this, so that the offset calculations are the same for both versions (e.g. in the test tool too), and as far as I know, we haven't had problems with this. But I can change it if people want. > > + in Probabilistic Verification". Version 1 Bloom filters have a bug that appears > > + when char is signed and the repository has path names that have characters >= > > + 0x80; Git supports reading and writing them, but this ability will be removed > > + in a future version of Git. > > Makes sense. > > Thanks, > Taylor Thanks for taking a look.