Nicolas Pitre <nico@xxxxxxx> writes: > On Sat, 8 Sep 2007, David Kastrup wrote: > >> In normal use cases, the performance wins are not overly impressive: >> we get something like 5-10% due to the slightly better locality of >> memory accesses using the packed structure. > > The gain is probably counterbalanced by the fact that you're copying > the whole index when packing it, which is unfortunate. It was a design choice (I don't particularly like it myself). An index is created once, used a dozen times. Doing the packing in-place implies using realloc on the finished index. I consider the likelihood of permanent memory fragmentation higher when doing that rather than when allocating a fresh area of the same size. Also a repack in-place is going to cost more operations and have quite more complicated code. > Also, could you provide actual test results backing your performance > claim? 5-10% is still not negligible. I did in my git repository for i in .git/objects/[0-9a-f][0-9a-f]/[0-9a-f]*;do echo $i;done|sed 's+.*ts/\(..\)/+\1+' > /tmp/objlist and then something like dak@lola:/usr/local/tmp/git$ time ./git-pack-objects </tmp/objlist --stdout|dd of=/dev/null 4099+2 records in 4100+1 records out 2099205 bytes (2.1 MB) copied, 3.99295 seconds, 526 kB/s real 0m4.023s user 0m3.812s sys 0m0.168s dak@lola:/usr/local/tmp/git$ time git-pack-objects </tmp/objlist --stdout|dd of=/dev/null 4099+2 records in 4100+1 records out 2099205 bytes (2.1 MB) copied, 4.18734 seconds, 501 kB/s real 0m4.218s user 0m4.012s sys 0m0.160s dak@lola:/usr/local/tmp/git$ repeatedly on a warm cache. Results were pretty much comparable: consistently my version was faster, but never more than 10%. >> - struct index_entry *entry, **hash; >> + struct unpacked_index_entry *entry, **hash; >> + struct index_entry *aentry, **ahash; > > What does the "a" stand for? array (as opposed to linked list). Was the first thing coming into my mind. Maybe I should have gone for "p" for packed, but I shied from it because it often is meant to imply "pointer". Alternatively I could replace "entry" with "uentry", but that affects more lines. What do you propose? >> + mem = index+1; > [...] >> + for (i=0; i<hsize; i++) { > [...] >> + for (entry=hash[i]; entry; entry=entry->next) > > Minor style nit: please add spaces around "+", "=", "<", etc. for > consistency. Can do. -- David Kastrup, Kriemhildstr. 15, 44793 Bochum - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html