Re: [PATCH v4 07/14] bloom: split 'get_bloom_filter()' in two

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,

On Sat, 5 Sep 2020 at 19:38, Taylor Blau <me@xxxxxxxxxxxx> wrote:
> On Sat, Sep 05, 2020 at 07:22:08PM +0200, Jakub Narębski wrote:
> > Taylor Blau <me@xxxxxxxxxxxx> writes:
> >
> > How do you distinguish between no changed paths stored because there
> > were no changes (which should not count as *_found_large), and no
> > changed paths stored because there were too many changes?  If I remember
> > it correctly in current implementation both are represented as
> > zero-length filter (no changed paths could have been represented as all
> > zeros filter, too many changed paths could have been represented as all
> > ones filter).
>
> Right, once we get handed back a filter from
> 'get_or_compute_bloom_filter()', we can't distinguish between (a) a
> commit with too many changes to store in a single Bloom filter, and (b)
> a commit with no changes at all.

We could change how we store either no-changes Bloom filter (as all
zeros minimal size filter), or too-many-changes Bloom filter (as all
ones, i.e. max unsigned value, minimal size filter). This change would
not require to change any user of Bloom filter.

> It's unfortunate that callers can't pick between the two, but this
> implementation is actually an improvement on the status-quo! Why?
> Because right now we'll see an "empty" Bloom filter and recompute it
> because it's "missing", only to discover that it has no changes.
>
> With this patch, we'll say "this filter looks too large", and stop
> computing it, because we have already gone through the effort to compute
> it once (and marked it in the BFXL chunk).

Can we use this when computing trace2 values?

[...]
> > This is a minor issue, though.
>
> Thanks for raising it. I don't think that this is a show-stopper for
> this series.

I agree.

Best,
-- 
Jakub Narębski




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux