Re: Restraining git pull/fetch to the current branch

Julian Phillips <julian@xxxxxxxxxxxxxxxxx> · Fri, 12 Jan 2007 14:08:28 +0000 (GMT)

On Thu, 11 Jan 2007, Junio C Hamano wrote:

> Julian Phillips <julian@xxxxxxxxxxxxxxxxx> writes:
>
>> While trying out git on a large repository (10000s of commits, 1000s
>> of branches, ~2.5Gb when packed) at work I noticed that doing a pull
>> was taking a long time (longer than I was prepared to wait anyway).

> Are they all real branches?  In other words, does your project
> have 1000s of active parallel development?

(Oops, overenthusiastic with the 0 there, I meant 100s of branches, about
880 atm).

They are mostly topic style branches, with only 20 or so in active use at 
any one time.  The idea of having to cope with 100s of active branches at 
the same time (given that we currently are using subversion) is quite 
frankly terrifying.

> Also, assuming the answer to the above question is yes, will you
> have 1000s of branches on your end and will work on any one of
> them?

It would be necessary to have access to all of the currently active 
branches at least, with the added complication that the set of current 
active branches changes quite rapidly.

> If you do not care about all 1000s of branches but are only
> interested in a selected few, you could change that configuration
> to suit your needs better.

I think the problem here would be keeping track of which branches are
currently active.  Some scheme could probably be derived, but I was hoping
that fetching an unchanged branch would be sufficiently fast that it would
not be necessary.  I appear to have been wrong :(

> I suspect most of the time is being spent in the
> append-fetch-head loop in fetch_main shell function in
> git-fetch.sh.  The true fix would not be to limit the number of
> branches updated, but to speed that part of the code up.

Indeed, each call to append_fetch_head is taking ~1.7s (~1.5s user, ~0.2s 
sys).  So simply looping over all the branches explains the ~27m that a 
complete fetch takes. (This is for fetch with no updates).  Given that a 
"clone orig new" takes ~8m30 (half of which would seem to be IO), it looks 
like it may be faster to create a new repository each time instead of 
updating the old one, which is certainly a viable workaround - but might 
imply that fetch has some room for improvement?
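
As a rough sanity check of those numbers, here is a back-of-envelope sketch.
It assumes the loop makes exactly one append_fetch_head call per branch and
uses only the figures quoted above (~880 branches, ~1.7s per call, ~8m30 for
a clone); the variable and function names are made up for illustration.

```python
# Back-of-envelope check of the fetch timings quoted in this message.
# Assumption: one append_fetch_head call per branch, nothing else counted.

BRANCHES = 880            # "about 880 atm"
SECONDS_PER_CALL = 1.7    # ~1.5s user + ~0.2s sys per append_fetch_head call
CLONE_MINUTES = 8.5       # "clone orig new" takes ~8m30


def estimated_loop_minutes(branches: int, seconds_per_call: float) -> float:
    """Total time spent in the per-branch loop, in minutes."""
    return branches * seconds_per_call / 60.0


loop_minutes = estimated_loop_minutes(BRANCHES, SECONDS_PER_CALL)

print(f"estimated loop time: {loop_minutes:.1f} min")                   # 24.9 min
print(f"relative to a fresh clone: {loop_minutes / CLONE_MINUTES:.1f}x")  # 2.9x
```

That gives roughly 25 minutes for the loop alone, which accounts for most of
the ~27m observed, and is about three times the cost of a fresh clone.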

--
Julian

 ---
There is nothing stranger in a strange land than the stranger who comes
to visit.