Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 23/03/18 02:46, Jeff King wrote:
> On Fri, Mar 23, 2018 at 01:28:12AM +0000, Ramsay Jones wrote:
> 
>>> Of the used heap after your patches:
>>>
>>>  - ~40% of that is from packlist_alloc()
>>>  - ~17% goes to "struct object"
>>>  - ~10% for the object.c hash table to store all the "struct object"
>>>  - ~7% goes to the delta cache
>>>  - ~7% goes to the pack revindex (actually, there's a duplicate 7%
>>>        there, too; I think our peak is when we're sorting the revindex
>>>        and have to keep two copies in memory at once)
>>
>> which begs the question, how much slower would it be if we
>> replaced the radix-sort with an in-place sort (e.g. heapsort).
>>
>> I hacked up the patch below, just for fun. I don't have any
>> large repos (or enough disk space) to do any meaningful perf
>> tests, but I did at least compile it and it passes the test-suite.
>> (That is no guarantee that I haven't introduced bugs, of course!)
> 
> It might have been easier to just revert 8b8dfd5132 (pack-revindex:
> radix-sort the revindex, 2013-07-11). It even includes some performance
> numbers. :)

But not as much fun! :)

> In short, no, I don't think we want to go back to a comparison-sort. The
> radix sort back then was around 4 times faster for linux.git. And that
> was when there were half as many objects in the repository, so the radix
> sort should continue to improve as the repo size grows.

Yes, I expected radix-sort to be faster.

> The absolute time savings aren't huge for something as bulky as a
> repack, so it's less exciting in this context. But it's also not that
> much memory (7% of the peak here, but as I showed elsewhere, if we can
> stop holding all of the "struct object" memory once we're done with it,
> then this revindex stuff doesn't even factor into the peak).

I didn't see this post until afterwards. So, if it isn't even a
factor in the peak memory use, then it's clear this specific
space/time trade-off also isn't an issue! ;-)

Thanks.

ATB,
Ramsay Jones



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux