On Sat, Sep 05, 2020 at 07:50:01PM +0200, Jakub Narębski wrote: > Hello, > > On Sat, 5 Sep 2020 at 19:38, Taylor Blau <me@xxxxxxxxxxxx> wrote: > > Right, once we get handed back a filter from > > 'get_or_compute_bloom_filter()', we can't distinguish between (a) a > > commit with too many changes to store in a single Bloom filter, and (b) > > a commit with no changes at all. > > We could change how we store either no-changes Bloom filter (as all > zeros minimal size filter), or too-many-changes Bloom filter (as all > ones, i.e. max unsigned value, minimal size filter). This change would > not require to change any user of Bloom filter. I don't think that's true. Say that we changed the empty Bloom filter to be encoded as only having the most-significant bit set. First, we'd have to write a Bloom filter where we didn't have to before. But the real issue is that commit-graph files generated with new clients would suddenly be unreadable by old clients. > > > It's unfortunate that callers can't pick between the two, but this > > implementation is actually an improvement on the status-quo! Why? > > Because right now we'll see an "empty" Bloom filter and recompute it > > because it's "missing", only to discover that it has no changes. > > > > With this patch, we'll say "this filter looks too large", and stop > > computing it, because we have already gone through the effort to compute > > it once (and marked it in the BFXL chunk). > > Can we use this when computing trace2 values? We could, but I don't think it's absolutely necessary. The test coverage in t4216 gives us enough confidence already. > [...] > > > This is a minor issue, though. > > > > Thanks for raising it. I don't think that this is a show-stopper for > > this series. > > I agree. Thanks for your input! > Best, > -- > Jakub Narębski Thanks, Taylor