Re: Git is not scalable with too many refs/*

On 09.09.2011 15:50, Michael Haggerty wrote:
> On 09/08/2011 09:53 PM, Martin Fick wrote:
>> Just thought that I should add some numbers to this thread as it seems that
>> the later versions of git are worse off by several orders of magnitude on
>> this one.  
>>
>> We have a Gerrit repo with just under 100K refs in refs/changes/*.  When I
>> fetch them all with git 1.7.6 it does not seem to complete.  Even after 5
>> days, it is just under half way through the ref #s! [...]
> 
> I recently reported very slow performance when doing a "git
> filter-branch" involving only about 1000 tags, with hints of O(N^3)
> scaling [1].  That could certainly explain enormous runtimes for 100k refs.
> 
> References are cached in git in a singly linked list, so it is easy to
> imagine O(N^2) all over the place (which is bad enough for 100k
> references).  I am working on improving the situation by reorganizing
> how the reference cache is stored in memory, but progress is slow.
> 
> I'm not sure whether your problem is related.  For example, it is not
> obvious to me why the commit that you cite (88a21979) would make the
> reference problem so dramatically worse.

88a21979 is the reason: since that commit, "git rev-list <sha1> --not --all"
is run for *every* updated ref to find all the new commits fetched for that
ref. And if you have 100K of them ...
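
To put a rough number on that, here is a back-of-the-envelope sketch in
Python. It is only a cost model, not git's actual code: it assumes that each
per-ref "rev-list ... --not --all" run has to enumerate every ref in the
repository at least once, and it ignores the commit walk itself. The function
names are made up for illustration, and the "batched" variant is a
hypothetical alternative for comparison, not something git is known to do.

```python
def per_ref_cost(num_refs: int, updated_refs: int) -> int:
    """Ref reads when one '--not --all' walk is run per updated ref.

    Assumption: every walk enumerates all num_refs refs once.
    """
    return updated_refs * num_refs


def batched_cost(num_refs: int, updated_refs: int) -> int:
    """Ref reads for a hypothetical single walk covering all updated refs.

    Assumption: all refs are enumerated once, plus one read per updated ref.
    """
    return num_refs + updated_refs


if __name__ == "__main__":
    n = 100_000  # roughly the number of refs/changes/* refs in Martin's repo
    print("per-ref walks:", per_ref_cost(n, n))
    print("single walk:  ", batched_cost(n, n))
```

Under these assumptions, fetching all 100K refs means on the order of 10^10
ref reads for the per-ref approach, against about 2 * 10^5 for a single
combined walk.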