On Sat, 17 Mar 2007, Linus Torvalds wrote: > > Instead of always re-generating the delta bases (possibly over and over > and over again), just cache the last few ones. They often can get re-used. Not just to compare actual timings, this shows the difference in the traces I did. Remember, before we had: [torvalds@woody linux]$ grep Needs delta-base-trace | wc -l 469334 [torvalds@woody linux]$ grep Needs delta-base-trace |sort -u | wc -l 21933 and now with the simple cache, I get: [torvalds@woody linux]$ grep Needs delta-base-trace-new | wc -l 28688 [torvalds@woody linux]$ grep Needs delta-base-trace-new | sort -u | wc -l 21933 ie, we still re-generate some of the objects multiple times, but now, rather than generating them (on average) 20+ times each, we now generate them an average of just 1.3 times each. Which explains why the wall-time goes down by over a factor of two. Changing the (statically sized) cache from 256 entries to 1024 (and updating the hash function appropriately of course) gets the number down to 23953 delta-base lookups (the number of unique ones obviously stays the same), for an average of just 1.1 object generates per unique object, and also means that you occasionally get sub-second times for my test-case of logging drivers/usb/. It all also means that libz isn't really even the top entry in the profiles any more, although it's still pretty high. But the profile now says: samples % app name symbol name 41527 15.6550 git strlen 30215 11.3905 git inflate 27504 10.3685 git inflate_table 20321 7.6607 git find_pack_entry_one 16892 6.3680 git interesting 16259 6.1294 vmlinux __copy_user_nocache 16010 6.0355 git inflate_fast 9240 3.4833 git get_mode 8863 3.3412 git tree_entry_extract 7145 2.6935 git strncmp 7131 2.6883 git memcpy 6863 2.5872 git diff_tree 6113 2.3045 git adler32 4515 1.7021 git _int_malloc 3022 1.1392 git update_tree_entry ... (Adding up all of libz is still ~31%, but it's lower as a percentage *and* it's obviously a smaller percentage of a much lower absolute time, so the zlib overhead went down much more than any other git overheads did) In general, this all seems very cool. The patches are simple enough that I think this is very safe to merge indeed: the only question I have is that somebody should verify that the "struct packed_git *p" is stable over the whole lifetime of a process - so that we can use it as a hash key without having to invalidate hashes if we unmap a pack (I *think* we just unmap the virtual mapping, and "struct packed_git *" stays valid, but Junio should ack that for me). Here's the trivial patch to extend the caching to 1k entries if somebody cares. I don't know if the small added performance is worth it. Linus --- diff --git a/sha1_file.c b/sha1_file.c index a7e3a2a..372af60 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1352,7 +1352,7 @@ static void *unpack_compressed_entry(struct packed_git *p, return buffer; } -#define MAX_DELTA_CACHE (256) +#define MAX_DELTA_CACHE (1024) static struct delta_base_cache_entry { struct packed_git *p; @@ -1367,8 +1367,8 @@ static unsigned long pack_entry_hash(struct packed_git *p, off_t base_offset) unsigned long hash; hash = (unsigned long)p + (unsigned long)base_offset; - hash += (hash >> 8) + (hash >> 16); - return hash & 0xff; + hash += (hash >> 10) + (hash >> 20); + return hash & (MAX_DELTA_CACHE-1); } static void *cache_or_unpack_entry(struct packed_git *p, off_t base_offset, - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html