Re: [PATCH] blame.c: don't drop origin blobs as eagerly

Duy Nguyen <pclouds@xxxxxxxxx> · Wed, 3 Apr 2019 19:06:02 +0700

On Wed, Apr 3, 2019 at 6:36 PM Jeff King <peff@xxxxxxxx> wrote:
> I suspect we could do even better by storing and reusing not just the
> original blob between diffs, but the intermediate diff state (i.e., the
> hashes produced by xdl_prepare(), which should be usable between
> multiple diffs). That's quite a bit more complex, though, and I imagine
> would require some surgery to xdiff.

Amazing. xdl_prepare_ctx and xdl_hash_record (called inside
xdl_prepare_ctx) account for 36% according to 'perf report'. Please
tell me you did not just get this on your first guess.

I tracked and dumped all the hashes that are sent to xdl_prepare() and
it looks like the number of duplicates is quite high. There are only
about 1000 one-time hashes out of 7000 (I didn't really draw a
histogram to examine this more closely). So yeah, this looks really
promising, assuming somebody is going to do something about it.
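
For anyone who wants to repeat the count, here is a minimal sketch of
one way to tally a dump of hash values. It is not the code used for the
numbers above and not part of git or xdiff; count_one_time_hashes() and
cmp_hash() are hypothetical names, and it assumes the dumped hashes
have already been read into an array of unsigned long.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for unsigned long hash values. */
static int cmp_hash(const void *a, const void *b)
{
	unsigned long x = *(const unsigned long *)a;
	unsigned long y = *(const unsigned long *)b;

	return (x > y) - (x < y);
}

/*
 * Hypothetical helper (not a git or xdiff function): given n dumped
 * hash values, return how many distinct values occur exactly once.
 * If distinct_out is non-NULL, it receives the number of distinct
 * values seen. Returns (size_t)-1 if memory allocation fails.
 */
static size_t count_one_time_hashes(const unsigned long *hashes, size_t n,
				    size_t *distinct_out)
{
	unsigned long *sorted;
	size_t i, singles = 0, distinct = 0;

	assert(n == 0 || hashes != NULL);
	if (n == 0) {
		if (distinct_out)
			*distinct_out = 0;
		return 0;
	}

	/* Sort a private copy so that equal hashes become adjacent. */
	sorted = malloc(n * sizeof(*sorted));
	if (!sorted)
		return (size_t)-1;
	memcpy(sorted, hashes, n * sizeof(*sorted));
	qsort(sorted, n, sizeof(*sorted), cmp_hash);

	/* Walk each run of equal values; a run of length 1 is "one-time". */
	for (i = 0; i < n; ) {
		size_t j = i + 1;

		while (j < n && sorted[j] == sorted[i])
			j++;
		if (j - i == 1)
			singles++;
		distinct++;
		i = j;
	}

	free(sorted);
	if (distinct_out)
		*distinct_out = distinct;
	return singles;
}
```

Comparing the return value against both n and the distinct count shows
how much hashing work is repeated across diffs, whichever of the two
totals is used as the baseline.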
-- 
Duy