Re: pack-object poor performance (with large number of objects?)

On Mon, Oct 3, 2011 at 6:05 PM, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
> On Mon, Oct 3, 2011 at 07:43, Piotr Krukowiecki
> <piotr.krukowiecki@xxxxxxxxx> wrote:
>> I'm seeing poor git gc (pack-objects) performance. Please read below
>> for details. What can I do to improve the performance or debug the
>> reason for the slowness? Should I leave the process running overnight,
>> or should I stop it (for debugging)?
>> CCing people who posted some patches/benchmarks for pack-objects recently.
>>
>> git gc was first run automatically by git svn clone. It found 1544673
>> objects and worked for 50 minutes until I killed it.
>>
>> Then I ran it by hand with --aggressive (because I'd read on the
>> Internet that it helped in some cases). It found 1742200 objects this
>> time. So far it has been working for about 90 minutes.
>
> Packing time depends on a number of factors. One of them is the number
> of unpacked objects to process. With 1.7 million objects, yes, it's
> going to take some time.

Are there any statistics on how long it should take?


> Another factor is how much RAM you have on
> your system. Packing requires a lot of memory, especially with the
> --aggressive flag, as the packer tries up to 250 different
> combinations of two objects searching for a good delta compression
> format, and all 250 of those are typically in-memory at once. If you
> have insufficient physical RAM, the system will swap, unless you
> decrease the window size.

I have 4GB of RAM and not all of it was in use, so it certainly
shouldn't be swapping. The process was in the 'D' state (uninterruptible
sleep, usually waiting on I/O), so I suppose the hard disk might be the
limiting factor.
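
One way to tell disk waits from memory pressure is to sample the
process state and resident memory while the pack is running. The
following is a minimal sketch that assumes Linux and its
/proc/<pid>/status file; the helper names are hypothetical and not part
of git.

```python
def parse_proc_status(text):
    """Return (state_letter, rss_kb) from the text of /proc/<pid>/status.

    Either value is None when the corresponding field is missing.
    """
    state = None
    rss_kb = None
    for line in text.splitlines():
        key, _, value = line.partition(":")
        value = value.strip()
        if key == "State" and value:
            # e.g. "D (disk sleep)" -> "D"
            state = value.split()[0]
        elif key == "VmRSS" and value:
            # e.g. "812340 kB" -> 812340
            rss_kb = int(value.split()[0])
    return state, rss_kb


def read_proc_status(pid):
    """Read and parse /proc/<pid>/status for a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_proc_status(f.read())
```

Sampling `read_proc_status(pid)` every few seconds for the pack-objects
process would show whether it stays in 'D' and whether its resident
size comes anywhere near the 4GB of physical RAM.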

I think I also disabled threading (I'll check tomorrow) - I suppose it
has an impact on packing time too.

I'll re-run packing tomorrow with threading and check the memory
usage. Is there anything else I can do?
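
For the re-run, the window, depth, and thread count can be set
explicitly instead of relying on --aggressive. The sketch below only
builds and runs a `git repack` command line; the helper names and the
default values are assumptions chosen for illustration (window 10 and
depth 50 match git's documented defaults), not a recommendation from
this thread.

```python
import subprocess


def repack_command(window=10, depth=50, threads=0, window_memory=None):
    """Build a full-repack command with an explicit delta window.

    threads=0 lets git pick the thread count itself; window_memory is an
    optional cap such as "256m" passed to --window-memory.
    """
    cmd = [
        "git", "-c", f"pack.threads={threads}",
        "repack", "-a", "-d", "-f",
        f"--window={window}", f"--depth={depth}",
    ]
    if window_memory is not None:
        cmd.append(f"--window-memory={window_memory}")
    return cmd


def run_repack(repo_path, **options):
    """Run the repack inside repo_path; raises CalledProcessError on failure."""
    return subprocess.run(repack_command(**options), cwd=repo_path, check=True)
```

A smaller window trades some pack size for lower memory use and less
work per object, which is the knob the quoted reply refers to.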

>
>> The large number of unpacked objects is probably caused by me - I
>> disabled auto gc when I was cloning from svn (I thought it might speed
>> things up if it didn't repack several times during the clone, only
>> once afterwards).
>
> Yes, this is the reason `git svn` runs GC during its import. If you defer
> all of the repacking work until the end, with everything loose, it can
> take a very, very long time to repack. If you repack as you go, the
> incremental repacks are less expensive than a full repack, and the
> entire process will go faster overall.

I've learned the lesson :)
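
To keep an eye on this during a long import, the number of loose
objects can be counted directly from the object store. This is a rough
sketch, not git's own implementation: the function names are
hypothetical, and the default threshold of 6700 is taken from the
documented default of `gc.auto`. `git count-objects` reports similar
information.

```python
import os

HEX_DIGITS = set("0123456789abcdef")


def count_loose_objects(git_dir):
    """Count files in the two-hex-digit fan-out directories under objects/."""
    objects_dir = os.path.join(git_dir, "objects")
    count = 0
    for name in os.listdir(objects_dir):
        path = os.path.join(objects_dir, name)
        # Loose objects live in objects/xx/; skip "pack", "info", etc.
        if len(name) == 2 and set(name) <= HEX_DIGITS and os.path.isdir(path):
            count += len(os.listdir(path))
    return count


def needs_repack(git_dir, threshold=6700):
    """Return True when the loose object count exceeds the threshold."""
    return count_loose_objects(git_dir) > threshold
```

Calling `needs_repack(".git")` periodically during an import gives an
early warning before millions of loose objects accumulate.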

-- 
Piotr Krukowiecki