Re: [PATCH 0/9] [RFC] Changed Paths Bloom Filters

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Dec 20, 2019 at 10:05:11PM +0000, Garima Singh via GitGitGadget wrote:

> Adopting changed path bloom filters has been discussed on the list before,
> and a prototype version was worked on by SZEDER Gábor, Jonathan Tan and Dr.
> Derrick Stolee [1]. This series is based on Dr. Stolee's approach [2] and
> presents an updated and more polished RFC version of the feature.

Great to see progress here. I probably won't have time to review this
carefully before the new year, but I did notice some low-hanging fruit
on the generation side.

So here are a few patches to reduce the CPU and memory usage. They could
be squashed in at the appropriate spots, or perhaps taken as inspiration
if there are better solutions (especially for the first one).

I think we could go further still, by actually doing a non-recursive
diff_tree_oid(), and then recursing into sub-trees ourselves. That would
save us having to split apart each path to add the leading paths to the
hashmap (most of which will be duplicates if the commit touched "a/b/c"
and "a/b/d", etc). I doubt it would be that huge a speedup though. We
have to keep a list of the touched paths anyway (since the bloom key
parameters depend on the number of entries), and most of the time is
almost certainly spent inflating the trees in the first place. However
it might be easier to follow the code, and it would make it simpler to
stop traversing at the 512-entry limit, rather than generating a huge
diff only to throw it away.

  [1/3]: commit-graph: examine changed-path objects in pack order
  [2/3]: commit-graph: free large diffs, too
  [3/3]: commit-graph: stop using full rev_info for diffs

 bloom.c        | 18 +++++++++---------
 commit-graph.c | 34 +++++++++++++++++++++++++++++++++-
 2 files changed, 42 insertions(+), 10 deletions(-)

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux