Re: Git is not scalable with too many refs/*

On Friday, September 30, 2011 10:41:13 am Martin Fick wrote:
> massive fix to bring it down to 7.5mins was awesome.
> 7-8mins sounded pretty good 2 weeks ago, especially when
> a checkout took 5+ mins!  but now that almost every
> other operation has been sped up, that is starting to
> feel a bit on the slow side still.  My spidey sense
> tells me something is still not quite right in the fetch
> path.

I guess I overlooked that there were 2 sides to this 
equation.  Even though I have been doing my fetches locally, 
I was using the file:// protocol, and it appears that the 
remote side was running git 1.7.6, which was in my path the 
whole time.  After eliminating that from my path and pointing 
both the remote and local ends at the "best" binary with all 
the fixes, the full fetch does indeed speed up quite a bit: 
it goes from about 7.5mins down to ~5mins!  Previously the 
remote seemed to spend most of the extra time after:

 remote: Counting objects: 316961

yet before:

 remote: Compressing objects


> Here is some more data to backup my spidey sense: after
> all the improvements, a noop fetch of all the changes
> (noop meaning they are all already uptodate) takes
> around 3mins with a non gced (non packed refs) case. 
> That same noop only takes ~12s in the gced (packed ref
> case)!

I believe (it is hard to go back and be sure) that this 
means that the timings above which gave me 3mins were 
because the remote was using git 1.7.6.  Now, with the good 
binary, I get great warm-cache times of about 11-13s for a 
noop fetch in both repos (packed and unpacked).  It is 
interesting to note that cold-cache times are 20s for packed 
refs and 1m30s for unpacked refs.  I guess that makes some 
sense.

But this does leave me thinking: should packed refs 
become the default, with a config option to disable them 
instead?  That might still help a fetch?
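In the meantime, the packed-refs layout can be produced by hand with 
git pack-refs (this is what a gc does as a side effect).  A rough 
sketch, assuming a repo with many loose refs like the one described 
above:

```shell
# Fold all loose refs under .git/refs into the single .git/packed-refs
# file, then drop the now-redundant loose copies.
git pack-refs --all --prune

# Sanity check: count any loose ref files left behind (newly created
# refs will reappear here until the next pack-refs/gc).
find .git/refs -type f | wc -l
```

New refs created after this go back to being loose files, so on a repo 
that gains ~80K refs/changes/* entries this would need to be re-run (or 
gc'd) periodically to keep the cold-cache win.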

Since a full sync is now down to about 5mins, I broke the 
output down a bit.  It appears that the longest part (2m45s) 
is now still the time spent scrolling through each change.  
Each one of these lines takes about 2ms:
 * [new branch]      refs/changes/99/71199/1 -> 
refs/changes/99/71199/1

Seems fast, but at about 80K refs...  So, are there any 
obvious O(N) loops over the refs happening inside each of 
the [new branch] iterations?
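One way to probe for that kind of quadratic behavior from the outside 
is to time a noop fetch at two different ref counts and compare.  The 
ref names and counts below are made up for illustration; note that 
`git update-ref --stdin` needs a modern git (1.8.5+), so older 
versions would have to loop over plain `git update-ref` calls instead:

```shell
# Create N refs/changes/* refs pointing at HEAD in one batch.
make_refs () {
  for i in $(seq 1 "$1"); do
    printf 'update refs/changes/%d/%d/1 HEAD\n' $((i % 100)) "$i"
  done | git update-ref --stdin
}

git init -q probe && cd probe
git commit -q --allow-empty -m seed

# Time a local fetch of all changes at 10K refs, then at 20K.  If the
# per-ref work is linear, the second run should take roughly twice as
# long; 4x or worse suggests an extra loop over all refs per ref.
make_refs 10000
time git fetch -q . '+refs/changes/*:refs/tmp/*'
make_refs 20000
time git fetch -q . '+refs/changes/*:refs/tmp/*'
```

This only measures the local side, but the same trick works against a 
file:// or ssh remote to separate the two halves of the equation.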


-Martin

-- 
Employee of Qualcomm Innovation Center, Inc. which is a 
member of Code Aurora Forum