Any tips for improving the performance of cloning large repositories?

Hi,

We've migrated our old CVS repository into Git without too many
issues. However, now that we are rolling out the new repository, we
are hitting some performance bottlenecks, especially on the initial
clone (something our buildbot instance does a lot).
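One mitigation I'm considering for the buildbot case, assuming the
builds never actually need history, is a shallow clone (untested on
our setup, so treat it as a sketch):

  git clone --depth 1 git://repo/repo.git

That should cap both the transfer and the local pack at roughly one
checkout's worth of objects.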

Our repo is large: my .git is around 2.5 GB, although the central repo
has a single 1.7 GB pack file. Some machines handle the cloning better
than others. For one thing, the clone process seems to require a large
chunk of memory on the receiving side, which causes problems when that
machine is under memory pressure.
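For reference, these are the pack-related memory knobs I know of from
the git-config man page; the values below are illustrative guesses,
not something I've validated against our repo:

  git config core.packedGitWindowSize 32m
  git config core.packedGitLimit 128m
  git config pack.windowMemory 256m
  git config pack.deltaCacheSize 64m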

I've tried tweaking the pack size limit from unlimited down to 256m,
but this seems to have increased the clone time, as the receiving end
attempts to re-pack everything back into a single uber-pack.
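For the record, the tweak on the serving repo was something like this
(assuming pack.packSizeLimit is the relevant knob):

  git config pack.packSizeLimit 256m
  git repack -a -d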

Another thing I've noticed is very high system time on the receiving
machines, as both the Ethernet and disk I/O are hit heavily.

So what I'm looking for are some tips on how I can tweak the
configuration to make the clone process a little less I/O- and
memory-heavy. Any suggestions?

One thing I did try was an rsync'ed local copy of the repo in
/var/cache/repos, which the clone command used as a reference, with
something like:

git clone --local --reference /var/cache/repo.git git://repo/repo.git

But that didn't help, as it seems to copy the whole thing anyway.
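(Re-reading the man page, --local is ignored when the source is a URL,
so it was probably a no-op above.) My understanding was that something
like the following should make the network transfer close to free,
assuming the cache is kept current first (paths are just our local
convention):

  git --git-dir=/var/cache/repo.git fetch git://repo/repo.git '+refs/heads/*:refs/heads/*'
  git clone --reference /var/cache/repo.git git://repo/repo.git

Or, alternatively, clone the cache itself (a local-path clone hardlinks
the objects, so it is nearly free) and repoint origin afterwards:

  git clone /var/cache/repo.git repo
  cd repo
  git config remote.origin.url git://repo/repo.git

If anyone can explain why the --reference approach still copied
everything, I'd appreciate a pointer.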

-- 
Alex, homepage: http://www.bennee.com/~alex/
http://www.half-llama.co.uk

