On Fri, Mar 23, 2018 at 3:46 AM, Jeff King <peff@xxxxxxxx> wrote: > On Fri, Mar 23, 2018 at 01:28:12AM +0000, Ramsay Jones wrote: > >> > Of the used heap after your patches: >> > >> > - ~40% of that is from packlist_alloc() >> > - ~17% goes to "struct object" >> > - ~10% for the object.c hash table to store all the "struct object" >> > - ~7% goes to the delta cache >> > - ~7% goes to the pack revindex (actually, there's a duplicate 7% >> > there, too; I think our peak is when we're sorting the revindex >> > and have to keep two copies in memory at once) >> >> which begs the question, how much slower would it be if we >> replaced the radix-sort with an in-place sort (e.g. heapsort). >> >> I hacked up the patch below, just for fun. I don't have any >> large repos (or enough disk space) to do any meaningful perf >> tests, but I did at least compile it and it passes the test-suite. >> (That is no guarantee that I haven't introduced bugs, of course!) > > It might have been easier to just revert 8b8dfd5132 (pack-revindex: > radix-sort the revindex, 2013-07-11). It even includes some performance > numbers. :) > > In short, no, I don't think we want to go back to a comparison-sort. The > radix sort back then was around 4 times faster for linux.git. And that > was when there were half as many objects in the repository, so the radix > sort should continue to improve as the repo size grows. > > The absolute time savings aren't huge for something as bulky as a > repack, so it's less exciting in this context. But it's also not that > much memory (7% of the peak here, but as I showed elsewhere, if we can > stop holding all of the "struct object" memory once we're done with it, > then this revindex stuff doesn't even factor into the peak). We probably could do something about revindex after rev-list memory is freed up too. revindex is used in two places (i'm ignoring packfile.c): when we look for base objects in ofs-delta in check_objects() and when we write a reused object. The first case can't be helped, it's why we need revindex. The second case though is just to get the compressed object size so we can copy the object. If we cache the compressed size (like with uint32_t) then we can free up revindex after check_objects() is done. Even if we repack everything, this is still a very good saving (32 bits vs 128 bits per object). The freed up memory could probably be used for delta cache. But then if we hit a compressed object size larger than 4GB, revindex must be recreated back, but we could detect this at check_objects phase and keep revindex alivie. -- Duy