Re: [PATCH] blame.c: don't drop origin blobs as eagerly

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Apr 3, 2019 at 2:45 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> David Kastrup <dak@xxxxxxx> writes:
>
> > When a parent blob already has chunks queued up for blaming, dropping
> > the blob at the end of one blame step will cause it to get reloaded
> > right away, doubling the amount of I/O and unpacking when processing a
> > linear history.
> >
> > Keeping such parent blobs in memory seems like a reasonable optimization
> > that should incur additional memory pressure mostly when processing the
> > merges from old branches.
>
> Thanks for finding an age-old one that dates back to 7c3c7962
> ("blame: drop blob data after passing blame to the parent",
> 2007-12-11).
>
> Interestingly, the said commit claims:
>
>     When passing blame from a parent to its parent (i.e. the
>     grandparent), the blob data for the parent may need to be read
>     again, but it should be relatively cheap, thanks to delta-base
>     cache.
>
> but perhaps you found a case where the delta-base cache is not all
> that effective in the benchmark?

Interesting. For some reason I keep remembering the delta-base cache
is for caching base objects, not all packed objects.

That might explain why I could not see significant gain when blaming
linux.git's MAINTAINERS file (0.5s was shaved out of 13s) even though
the number of objects read was cut by half (8424 vs 15083).

I just tried again. The number of actual pack reading is slightly
reduced with the patch (498260 vs 502140). Not by a large margin. But
I imagine if the cache is under pressure (MAINTAINERS file is quite
small, 426k), we may get more eviction and misses from delta-base
cache.

It might still help when we need to read loose objects though. But I
guess this matters even less.

And I don't know how lazy clones behave in this case. If we get new
objects and store as loose, then it helps a bit more.

> Will queue.  Thanks.
>
>
>
>
> >
> > Signed-off-by: David Kastrup <dak@xxxxxxx>
> > ---
> >  blame.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/blame.c b/blame.c
> > index 5c07dec190..c11c516921 100644
> > --- a/blame.c
> > +++ b/blame.c
> > @@ -1562,7 +1562,8 @@ static void pass_blame(struct blame_scoreboard *sb, struct blame_origin *origin,
> >       }
> >       for (i = 0; i < num_sg; i++) {
> >               if (sg_origin[i]) {
> > -                     drop_origin_blob(sg_origin[i]);
> > +                     if (!sg_origin[i]->suspects)
> > +                             drop_origin_blob(sg_origin[i]);
> >                       blame_origin_decref(sg_origin[i]);
> >               }
> >       }



-- 
Duy



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux