David Kastrup <dak@xxxxxxx> writes:

> When a parent blob already has chunks queued up for blaming, dropping
> the blob at the end of one blame step will cause it to get reloaded
> right away, doubling the amount of I/O and unpacking when processing a
> linear history.
>
> Keeping such parent blobs in memory seems like a reasonable optimization
> that should incur additional memory pressure mostly when processing the
> merges from old branches.

Thanks for finding an age-old one that dates back to 7c3c7962
("blame: drop blob data after passing blame to the parent",
2007-12-11).

Interestingly, the said commit claims:

    When passing blame from a parent to its parent (i.e. the
    grandparent), the blob data for the parent may need to be read
    again, but it should be relatively cheap, thanks to delta-base
    cache.

but perhaps you found a case where the delta-base cache is not all
that effective in the benchmark?

Will queue.  Thanks.

>
> Signed-off-by: David Kastrup <dak@xxxxxxx>
> ---
>  blame.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/blame.c b/blame.c
> index 5c07dec190..c11c516921 100644
> --- a/blame.c
> +++ b/blame.c
> @@ -1562,7 +1562,8 @@ static void pass_blame(struct blame_scoreboard *sb, struct blame_origin *origin,
>  	}
>  	for (i = 0; i < num_sg; i++) {
>  		if (sg_origin[i]) {
> -			drop_origin_blob(sg_origin[i]);
> +			if (!sg_origin[i]->suspects)
> +				drop_origin_blob(sg_origin[i]);
>  			blame_origin_decref(sg_origin[i]);
>  		}
>  	}