On Wed, Apr 03, 2019 at 04:32:30PM +0700, Duy Nguyen wrote:

> That might explain why I could not see significant gain when blaming
> linux.git's MAINTAINERS file (0.5s was shaved out of 13s) even though
> the number of objects read was cut by half (8424 vs 15083).

I did a few timings, too, and managed to come up with similar
improvements (only a small fraction, and only for large files).

I think the main thing is simply that loading the blob from the object
database is a fraction of the total work done. We still have to actually
diff the blobs, which is at least as expensive as loading them from
disk. We also have to load commits and trees from disk as we traverse.
Enabling the commit-graph would shrink that portion (and make
improvements in the blob loading proportionally more impressive).

All that said, this seems like an easy and obvious win, and worth doing.
0.5s is still something.

I suspect we could do even better by storing and reusing not just the
original blob between diffs, but the intermediate diff state (i.e., the
hashes produced by xdl_prepare(), which should be usable between
multiple diffs). That's quite a bit more complex, though, and I imagine
would require some surgery to xdiff.

-Peff
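
To make the "reuse the intermediate diff state" idea a bit more concrete,
here is a rough Python sketch. It is not how xdiff works internally and
not a proposed patch. The blob table, prepare() and changed_lines() are
all hypothetical names made up for illustration, and difflib stands in
for the real diff engine. The only point is that the per-line hashing of
a blob is done once and then shared by every diff that blob takes part in.

```python
import difflib
from functools import lru_cache

# Hypothetical stand-in for the object database: blob id -> contents.
BLOBS = {
    "v1": "a\nb\nc\n",
    "v2": "a\nB\nc\n",
    "v3": "a\nB\nc\nd\n",
}


@lru_cache(maxsize=None)
def prepare(blob_id):
    """Split a blob into lines and hash each line, at most once per blob.

    This plays the role of the "prepared" state that could be kept
    around between diffs (an assumption for illustration only).
    """
    return tuple(hash(line) for line in BLOBS[blob_id].splitlines())


def changed_lines(old_id, new_id):
    """Return the 0-based line indices in new_id that differ from old_id."""
    old, new = prepare(old_id), prepare(new_id)
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    changed = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "replace" and "insert" are the opcodes that introduce new lines.
        if tag in ("replace", "insert"):
            changed.extend(range(j1, j2))
    return changed


# Walking a history v1 -> v2 -> v3: "v2" is the new side of the first
# diff and the old side of the second, so its hashes are computed once.
first = changed_lines("v1", "v2")
second = changed_lines("v2", "v3")
info = prepare.cache_info()
```

In a blame-like walk each blob appears in two consecutive diffs, so a
cache like this would roughly halve the hashing work. Whether that is
worth the extra memory and the plumbing would need measuring.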