On Wed, Apr 3, 2019 at 2:45 PM Junio C Hamano <gitster@xxxxxxxxx> wrote: > > David Kastrup <dak@xxxxxxx> writes: > > > When a parent blob already has chunks queued up for blaming, dropping > > the blob at the end of one blame step will cause it to get reloaded > > right away, doubling the amount of I/O and unpacking when processing a > > linear history. > > > > Keeping such parent blobs in memory seems like a reasonable optimization > > that should incur additional memory pressure mostly when processing the > > merges from old branches. > > Thanks for finding an age-old one that dates back to 7c3c7962 > ("blame: drop blob data after passing blame to the parent", > 2007-12-11). > > Interestingly, the said commit claims: > > When passing blame from a parent to its parent (i.e. the > grandparent), the blob data for the parent may need to be read > again, but it should be relatively cheap, thanks to delta-base > cache. > > but perhaps you found a case where the delta-base cache is not all > that effective in the benchmark? Interesting. For some reason I keep remembering the delta-base cache is for caching base objects, not all packed objects. That might explain why I could not see significant gain when blaming linux.git's MAINTAINERS file (0.5s was shaved out of 13s) even though the number of objects read was cut by half (8424 vs 15083). I just tried again. The number of actual pack reading is slightly reduced with the patch (498260 vs 502140). Not by a large margin. But I imagine if the cache is under pressure (MAINTAINERS file is quite small, 426k), we may get more eviction and misses from delta-base cache. It might still help when we need to read loose objects though. But I guess this matters even less. And I don't know how lazy clones behave in this case. If we get new objects and store as loose, then it helps a bit more. > Will queue. Thanks. > > > > > > > > Signed-off-by: David Kastrup <dak@xxxxxxx> > > --- > > blame.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/blame.c b/blame.c > > index 5c07dec190..c11c516921 100644 > > --- a/blame.c > > +++ b/blame.c > > @@ -1562,7 +1562,8 @@ static void pass_blame(struct blame_scoreboard *sb, struct blame_origin *origin, > > } > > for (i = 0; i < num_sg; i++) { > > if (sg_origin[i]) { > > - drop_origin_blob(sg_origin[i]); > > + if (!sg_origin[i]->suspects) > > + drop_origin_blob(sg_origin[i]); > > blame_origin_decref(sg_origin[i]); > > } > > } -- Duy