Re: Git is not scalable with too many refs/*

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 09/08/2011 09:53 PM, Martin Fick wrote:
> Just thought that I should add some numbers to this thread as it seems that
> the later versions of git are worse off by several orders of magnitude on
> this one.  
> 
> We have a Gerrit repo with just under 100K refs in refs/changes/*.  When I
> fetch them all with git 1.7.6 it does not seem to complete.  Even after 5
> days, it is just under half way through the ref #s! [...]

I recently reported very slow performance when doing a "git
filter-branch" involving only about 1000 tags, with hints of O(N^3)
scaling [1].  That could certainly explain enormous runtimes for 100k refs.

References are cached in git in a single linked list, so it is easy to
imagine O(N^2) all over the place (which is bad enough for 100k
references).  I am working on improving the situation by reorganizing
how the reference cache is stored in memory, but progress is slow.

I'm not sure whether your problem is related.  For example, it is not
obvious to me why the commit that you cite (88a21979) would make the
reference problem so dramatically worse.

I suggest the following experiments to characterize the problem:

1. Fetch the references in batches of a few hundred each, and see if
that dramatically decreases the total time.

2. Same as (1), except run "git pack-refs --all --prune" between the
batches.  In my experiments, packing references made a dramatic
difference in runtimes.

3. Try using the --no-replace-objects option (I assume that it can be
used like "git --no-replace-objects fetch ...").  In my case this option
made a dramatic improvement in the runtimes.

4. Try a test using a repository generated something like the test
script that I posted in [1].  If it also gives pathologically bad
performance, then it can serve as a test case to use while we debug the
problem.

Yours,
Michael

[1] http://comments.gmane.org/gmane.comp.version-control.git/177103

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]