Re: How to still kill git fetch with too many refs


On 07/02/2013 05:02 AM, Martin Fick wrote:
> I have often reported problems with git fetch when there are 
> many refs in a repo, and I have been pleasantly surprised 
> how many problems I reported were so quickly fixed. :) With 
> time, others have created various synthetic test cases to 
> ensure that git can handle many many refs.  A simple 
> synthetic test case with 1M refs all pointing to the same 
> sha1 seems to be easily handled by git these days.  However, 
> in our experience with our internal git repo, we still have 
> performance issues related to having too many refs; in our 
> kernel/msm instance we have around 400K.
> 
> When I tried the simple synthetic test case, I could not 
> reproduce bad results, so I tried something just a little 
> more complex and was able to get atrocious results!!! 
> Basically, I generate a packed-refs file with many refs 
> which each point to a different sha1.  To get a list of 
> valid but unique sha1s for the repo, I simply used rev-list.  
> The result: a copy of Linus' repo with a million unique 
> valid refs, and a git fetch of a single updated ref taking a 
> very long time (55 mins and it had not completed yet).  Note, 
> with 100K refs it completes in about 2m40s.  It is likely 
> not linear since 2m40s * 10 would be ~26m (but the 
> difference could also just be how the data in the sha1s are 
> ordered).
> 
> 
> Here is my small reproducible test case for this issue:
> 
> git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> cp -rp linux linux.1Mrefs-revlist
> 
> cd linux
> echo "Hello" > hello ; git add hello ; git commit -a -m 'hello'
> cd ..
> 
> cd linux.1Mrefs-revlist
> git rev-list HEAD | for nn in $(seq 0 100) ; do
>   for c in $(seq 0 10000) ; do
>     read sha ; echo $sha refs/c/$nn/$c$nn
>   done
> done > .git/packed-refs

I believe this generates a packed-refs file that is not sorted
lexicographically by refname, whereas all Git-generated packed-refs
files are sorted.  There are some optimizations in refs.c for adding
references in order that might therefore be circumvented by your
unsorted file.  Please try sorting the file by refname and see if that
helps.  (You can do so by deleting one of the packed references; then
git will sort the remainder while rewriting the file.)
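
If you would rather sort the file directly instead of letting git
rewrite it, something like the following might do as a minimal sketch.
It assumes every line has the form "<sha1> <refname>", as your loop
produces, with no "# pack-refs" header line and no "^<sha1>" peeled
lines; the function names are only illustrative.

```shell
# Sort a packed-refs-style file in place by refname (the second field).
# Assumption: every line is "<sha1> <refname>"; header lines and
# "^<sha1>" peeled-tag lines are NOT handled by this sketch.
# LC_ALL=C forces a bytewise comparison rather than a locale-dependent one.
sort_packed_refs() {
    file=$1
    LC_ALL=C sort -k 2 "$file" > "$file.sorted" && mv "$file.sorted" "$file"
}

# Succeed (exit status 0) only if the file is already sorted by refname.
is_sorted_by_refname() {
    LC_ALL=C sort -c -k 2 "$1" 2>/dev/null
}
```

Run from inside linux.1Mrefs-revlist, "is_sorted_by_refname
.git/packed-refs" would tell you whether the generated file is out of
order, and "sort_packed_refs .git/packed-refs" would reorder it before
you repeat the fetch.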

Michael

-- 
Michael Haggerty
mhagger@xxxxxxxxxxxx
http://softwareswirl.blogspot.com/