On Fri, 12 Jan 2007, Julian Phillips wrote:
> > I suspect most of the time is being spent in the
> > append-fetch-head loop in the fetch_main shell function in
> > git-fetch.sh. The true fix would not be to limit the number of
> > branches updated, but to speed that part of the code up.
> Indeed, each call to append_fetch_head is taking ~1.7s (~1.5s user,
> ~0.2s sys), so simply looping over all the branches explains the ~27m
> that a complete fetch takes (this is for a fetch with no updates).
> Given that a "clone orig new" takes ~8m30s (half of which would seem
> to be IO), it looks like it may be faster to create a new repository
> each time instead of updating the old one, which is certainly a
> viable workaround - but it might imply that fetch has some room for
> improvement?
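(For concreteness, that re-clone workaround amounts to something like
the following sketch; "orig" and "new" are just the paths from the
timing test above.)

# Sketch of the re-clone workaround: build a fresh clone, then
# swap it in for the old repository.
rm -rf new.tmp &&
git clone orig new.tmp &&
rm -rf new &&
mv new.tmp new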
I have had a chance to spend a little more time looking at this. It would
appear that the major culprit is show-ref.
Running "git show-ref --hash <ref>" takes ~1.7s, compared to 0.002s for
"cat $GIT_DIR/<ref>". If I add the following to the top of
append_fetch_head a null fetch takes 1m28s instead of ~27m.
# Fast path: if the local ref already matches the remote head,
# there is nothing to do for this branch, so skip the expensive body.
local_head_=$(cat "$GIT_DIR/$local_name_" 2>/dev/null)
if [ "$head_" = "$local_head_" ]; then
	return
fi
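(One caveat with reading the loose ref file directly: refs that have
been packed by git-pack-refs have no file under $GIT_DIR/refs/, so the
cat fails, local_head_ is empty, and we simply fall through to the
normal slow path. The check stays correct; it is just only fast for
loose refs.)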
Looking at the code for show-ref, it appears to look at all the refs
to find the one you asked for. This makes fetch O(n^2) in the number
of branches, which does not seem strictly necessary - but then I am
not really familiar with the internal workings of git. I noticed that
the man page for show-ref says that its use is encouraged over direct
access to the ref files, but as it stands it is far too slow to be
used in fetch when you have a large, many-branched repository...
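To illustrate the kind of lookup the loop actually needs, here is a
rough sketch (not a patch; the helper name fast_ref_hash_ is made up):
read the loose ref file if it exists, and otherwise take a single pass
over packed-refs for just that one ref, instead of forking show-ref to
walk every ref on every iteration.

# Sketch only: resolve a single ref without scanning all refs.
# Falls back to one scan of packed-refs when the loose file is
# missing (the name fast_ref_hash_ is hypothetical).
fast_ref_hash_ () {
	if [ -f "$GIT_DIR/$1" ]; then
		cat "$GIT_DIR/$1"
	elif [ -f "$GIT_DIR/packed-refs" ]; then
		sed -n "s|^\([0-9a-f]\{40\}\) $1\$|\1|p" "$GIT_DIR/packed-refs"
	fi
}

The loop would then do local_head_=$(fast_ref_hash_ "$local_name_").
The packed case is still a linear scan, but of one file, rather than a
full ref walk (and a fork) per branch.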
(In looking at this I also discovered that if you have too many
branches, fetch will die with an over-long command line error when
calling git-fetch-pack, since the ref names are all passed as
arguments.)
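(For anyone wondering how close their repository is to that limit, a
rough check is to compare the bytes the branch names would occupy
against the kernel's argument-size limit:)

# Rough check: bytes of branch names vs. the kernel's argument limit.
ref_bytes=$(git show-ref --heads | sed 's/^[0-9a-f]* //' | wc -c)
echo "branch name bytes: $ref_bytes; ARG_MAX: $(getconf ARG_MAX)"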
--
Julian
---
I will never lie to you.