Re: [PATCH v4 00/15] Changed Paths Bloom Filters

Hi Stolee,

On Wed, Apr 08, 2020 at 11:51:14AM -0400, Derrick Stolee wrote:
> On 4/6/2020 12:59 PM, Garima Singh via GitGitGadget wrote:
> > Hey!
> >
> > The commit graph feature brought in a lot of performance improvements across
> > multiple commands. However, file based history continues to be a performance
> > pain point, especially in large repositories.
> >
> > Adopting changed path Bloom filters has been discussed on the list before,
> > and a prototype version was worked on by SZEDER Gábor, Jonathan Tan and Dr.
> > Derrick Stolee [1]. This series is based on Dr. Stolee's proof of concept in
> > [2]
> >
> > With the changes in this series, git users will be able to choose to write
> > Bloom filters to the commit-graph using the following command:
> >
> > 'git commit-graph write --changed-paths'
> >
> > Subsequent 'git log -- path' commands will use these computed Bloom filters
> > to decide which commits are worth exploring further to produce the history
> > of the provided path.
>
> I noticed Jakub was not CC'd on this email. Jakub: do you plan to re-review
> the new version? Or are you satisfied with the resolutions to your comments?
>
> Is anyone else planning to review this series?

I feel horrible that I've had this patch series sitting in my review
backlog for months and haven't gotten to it yet, especially because I
have such an interest in these patches and know that much care was taken
to prepare them.

I read through these patches over some coffee today at a cursory level.
The high-level approach makes sense to me, and the implementation looks
solid. I think that anything that does come up (see below) can be
addressed in 'next' rather than waiting longer on this series.

For what it's worth, I'm planning on starting to test this series in
some of our testing repositories at GitHub, and I'll report back on our
experience with some notes (and patches) should anything come up.

> I'm just wondering when we should take this series to cook in 'next' and
> start building things on top of it, such as "git blame" or "git log -L"
> improvements. While it cooks, any bugs or issues could be resolved with
> patches on top of this version. That would be my preference, anyway.

That would be my preference, too.

I noticed a few small things (mostly a couple of typos and other very
minor details). But, I'd much rather build on top of this series once it
has landed in 'next' than go to a fifth re-roll since there are many
patches involved.

I also noticed that you have already sent a separate series of patches
based on this one, which would apply cleanly if this series is merged
into 'next'.

This will also be helpful as I send some patches about extra
'commit-graph write' options out of GitHub's fork, since they will
inevitably create merge conflicts if we both are targeting 'next'. So,
I figure that this approach will ease some maintainer burden ;-).

>
> What do you think, Junio?
>
> Thanks,
> -Stolee

Thanks,
Taylor


