Re: git-fetch per-repository speed issues

On Mon, 3 Jul 2006, Keith Packard wrote:
> On Mon, 2006-07-03 at 16:14 -0700, Linus Torvalds wrote:
> > 
> > Well, you could use multiple branches in the same repository, even if they 
> > are totally unrelated. That would allow you to fetch them all in one go.
> 
> I'd like to avoid this; the hope is that most people won't ever need to
> look at most repositories; it would be somewhat like having glibc in the
> same repo as the kernel...

Sure, understood. I'm just saying that if you want to fetch in one go, 
it's one possibility.

However, your setup has something else seriously wrong.

> Yeah, I tried with the git protocol and it's a few seconds faster (about
> 14 seconds instead of 17). Ick.

That's -still- about 13 seconds too much.

> I think it might have something to do with the number of heads we're
> tracking.

It really shouldn't matter. You get all the heads in one go with a single 
connection, so if 32 heads takes 32 times longer, there's something wrong.

> > Also, one thing to try is to just do
> > 
> > 	strace -Ttt git-peek-remote ...
> 
> That's plenty fast, 0.410 seconds, with nothing ugly in the strace.

Ok, a "git fetch" really shouldn't take any longer than a single 
connection. However, the fact that you have 32 heads, and it takes pretty 
close to _exactly_ 32 times 0.410 seconds (32*0.410s = 13.1s) makes me 
suspect that "git fetch" is just broken and fetches one branch at a time. 

Which would be just stupid.
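
The arithmetic behind that suspicion can be checked with a few lines of shell. This is only a sketch: "estimate_serial_fetch" is a made-up helper name, not anything in git, and the inputs are simply the 32 heads and the 0.410 second git-peek-remote time quoted above. It assumes the worst case, where every head costs one full connection.

```shell
#!/bin/sh
# Hypothetical helper (not part of git): estimate the total fetch time if
# every head were fetched over its own connection, one after another.
estimate_serial_fetch() {
    heads=$1       # number of heads being tracked
    per_conn=$2    # seconds for one connection (e.g. one git-peek-remote run)
    # LC_ALL=C keeps the decimal separator a "." regardless of locale.
    LC_ALL=C awk -v n="$heads" -v t="$per_conn" \
        'BEGIN { printf "%.1f\n", n * t }'
}

# Numbers from this thread: 32 heads, 0.410 seconds per connection.
estimate_serial_fetch 32 0.410
```

With those inputs it prints 13.1, which is in the same range as the reported 14 seconds. A fetch that really used a single connection would be expected to land much closer to the 0.410 figure.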

But look as I might, I see only that one "git-fetch-pack" in git-fetch.sh 
that should trigger. Once. Not 32 times. But your timings sure sound like 
it's doing a _lot_ more than it should.

Junio, any ideas?

Keithp, can you try this trivial patch? It _should_ say something like

	Fetching
	refs/heads/master
	refs/heads/...
	refs/heads/...
	...
	refs/heads/... from git://..../...

and more importantly, it should say so only once.

And then it should leave a "fetch.trace" file in your working directory, 
which should show where that _one_ thing spends its time.
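
One possible way to skim that trace file, as a rough sketch: with "-T", strace appends the time spent in each system call as "<seconds>" at the end of the line, so the slowest calls can be pulled out with awk and sort. The function name "slowest_syscalls" is made up for illustration, and the sketch assumes the plain "strace -o fetch.trace -Ttt" format used in the patch (no "-f", so no pid column in front of the timestamp).

```shell
#!/bin/sh
# Hypothetical helper: list the slowest system calls recorded in a trace
# written by "strace -o FILE -Ttt ...".
# Usage: slowest_syscalls fetch.trace [count]
slowest_syscalls() {
    trace_file=$1
    count=${2:-10}
    LC_ALL=C awk '
        # Only lines ending in <seconds> are completed system calls.
        match($0, /<[0-9]+\.[0-9]+>$/) {
            secs = substr($0, RSTART + 1, RLENGTH - 2)  # strip "<" and ">"
            call = $2                                   # field 1 is the -tt timestamp
            sub(/\(.*/, "", call)                       # keep only the syscall name
            print secs, call
        }
    ' "$trace_file" | LC_ALL=C sort -rn | head -n "$count"
}

# Example (assumes fetch.trace exists in the current directory):
#   slowest_syscalls fetch.trace 5
```

If most of the time turns out to sit in a handful of blocking calls such as read or connect, that points at waiting on the network rather than at local work, though the trace only covers the one git-fetch-pack process and not anything the script does around it.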

		Linus

----
diff --git a/git-fetch.sh b/git-fetch.sh
index 48818f8..4739202 100755
--- a/git-fetch.sh
+++ b/git-fetch.sh
@@ -339,6 +339,8 @@ fetch_main () {
     ( : subshell because we muck with IFS
       IFS=" 	$LF"
       (
+	  echo "Fetching $rref from $remote" >&2
+	  strace -o fetch.trace -Ttt \
 	  git-fetch-pack $exec $keep --thin "$remote" $rref || echo failed "$remote"
       ) |
       while read sha1 remote_name