Re: [PATCH 2/2] Implement a simple delta_base cache

On Sat, 17 Mar 2007, Linus Torvalds wrote:
> 
> This trivial 256-entry delta_base cache improves performance for some 
> loads by a factor of 2.5 or so.

Btw, final comment on this issue:

I was initially a bit worried about optimizing for just the "git log" with 
pathspec or "git blame" kind of behaviour, and possibly pessimizing some 
other load.

But given the way the caching works, this is likely to be faster (or at 
least not slower) even for something that never needs the cache (which in 
turn is likely to be because it's a smaller footprint query and only works 
on one version).

Because of the way the cache works, it doesn't really do any extra work: it 
basically just delays the "free()" on the buffer we allocated. So for 
really small footprints it just avoids the overhead of free() (let the OS 
reap the pages for it at exit), and for bigger footprints (that end up 
replacing the cache entries) it will just do the same work a bit later.

Because it's a simple direct-mapped cache, the only cost is the (trivial) 
hash of a few instructions, and possibly the slightly bigger D$ footprint. 
I would strongly suspect that even on loads where it doesn't help by 
reusing the cached objects, the delayed free'ing on its own is as likely 
to help as it is to hurt.
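
To make the "delayed free()" point concrete, here is a minimal sketch of a 
direct-mapped cache of that shape. It is not the code from the patch: the 
names (cache_entry, slot_for, cache_take, cache_put), the key fields and the 
exact hash mixing are all assumptions made up for illustration. The only 
things taken from the description above are the 256 slots, the 
few-instruction hash, and the idea that a buffer is parked in a slot 
instead of being freed, with the real free() happening when a later entry 
replaces it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_SLOTS 256	/* the 256-entry size mentioned above */

/* Hypothetical entry layout: a key plus the buffer we would otherwise free. */
struct cache_entry {
	uintptr_t pack_id;	/* stand-in for "which pack file" */
	uint64_t offset;	/* stand-in for the base object's offset */
	void *data;		/* NULL means the slot is empty */
	size_t size;
};

static struct cache_entry cache[CACHE_SLOTS];

/* A few instructions of mixing, then mask down to a slot index. */
static unsigned slot_for(uintptr_t pack_id, uint64_t offset)
{
	uint64_t h = (uint64_t)pack_id + offset;

	h += (h >> 8) + (h >> 16);
	return (unsigned)(h & (CACHE_SLOTS - 1));
}

/* On a hit, hand the buffer back to the caller and empty the slot. */
static void *cache_take(uintptr_t pack_id, uint64_t offset, size_t *size)
{
	struct cache_entry *e = &cache[slot_for(pack_id, offset)];
	void *data;

	if (!e->data || e->pack_id != pack_id || e->offset != offset)
		return NULL;	/* miss: caller regenerates the object */
	data = e->data;
	*size = e->size;
	e->data = NULL;
	return data;
}

/* Called where the caller would otherwise have done free(data). */
static void cache_put(uintptr_t pack_id, uint64_t offset,
		      void *data, size_t size)
{
	struct cache_entry *e = &cache[slot_for(pack_id, offset)];

	assert(data != NULL);
	free(e->data);		/* the delayed free; free(NULL) is a no-op */
	e->pack_id = pack_id;
	e->offset = offset;
	e->data = data;
	e->size = size;
}
```

In this sketch a small query that never collides never calls free() at all, 
and a big one does exactly one free() per replaced entry, which is the same 
work as before, just later.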

So there really shouldn't be any downsides.

Testing on some other loads (for example, drivers/scsi/ has more activity 
than drivers/usb/), the 2x performance win seems to happen for other 
things too. For drivers/scsi, log generation went down from 3.582s 
(best) to 1.448s.

"git blame Makefile" went from 1.802s to 1.243s (both best-case numbers 
again: a smaller win, but still a win), but there the issue seems to be 
that with a file like that, we actually spend most of our time comparing 
different versions.

For the "git blame Makefile" case *all* of zlib combined is just 18%, 
while the ostensibly trivial "cmp_suspect()" is 23% and another 11% is 
from "assign_blame()" - so for top-level entries the costs seem to be 
mostly in the blame algorithm itself, rather than in the actual object 
handling.

(I'm sure that could be improved too, but the take-home message from this 
is that zlib wasn't really the problem, and our stupid re-generation of 
the same delta base was.)

			Linus