Re: q: git-fetch a tad slow?

Ingo Molnar <mingo@xxxxxxx> wrote:
> 
> Setup/background: distributed kernel testing cluster, [...]
> 
> Problem: i noticed that git-fetch is a tad slow:
> 
>   titan:~/tip> time git-fetch
>   real    0m2.372s
> 
> There are hundreds of branches, so i thought fetching a single branch 
> alone would improve things:
> 
>   titan:~/tip> time git-fetch origin master
>   real    0m0.942s
>
> But that's still slow - so i use a (lame) ad-hoc script instead:
> 
>   titan:~/tip> time tip-fetch
>   real    0m0.246s

OK, yes, when there are _many_ branches like that, limiting the fetch
to only the branch(es) you actually need can make it go much faster.
Part of the problem is that we loop over the branches many times,
and each of those is an O(N) loop (N = number of branches).  We
could do better, but we don't.

One reason your tip-fetch runs so much faster is that it doesn't
have to enumerate the hundreds of branches advertised by the remote
peer to find the one you want to fetch.  Your tip-fetch reads only
that one ref file (.git/refs/heads/master) and that's pretty much it.

In contrast git-upload-pack on the server side must open and read
_all_ ref files under .git/refs/ and send them to the client, who
then has to loop over them at least twice before it can decide if
a match exists.  That's a lot more data to shove down over SSH.
Granted, it's only 42 bytes + refname per ref, but it's still more.
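
To put a rough number on that, here is a back-of-the-envelope sketch
in Python.  It takes the 42 bytes + refname figure above at face
value; the branch names and the count of 300 are made up for
illustration, and real protocol framing may add a few bytes per ref.

```python
def advertisement_bytes(refnames, per_ref_overhead=42):
    """Estimate the bytes needed to advertise the given refs.

    per_ref_overhead is the fixed cost per ref quoted in the message
    (42 bytes); the refname length is added on top of it.
    """
    return sum(per_ref_overhead + len(name.encode()) for name in refnames)


# Hypothetical repository: 300 topic branches with 20-character refnames.
refs = [f"refs/heads/topic-{i:03d}" for i in range(300)]
total = advertisement_bytes(refs)
single = advertisement_bytes(["refs/heads/master"])

print(f"{len(refs)} refs: about {total} bytes; one ref: about {single} bytes")
```

So it is tens of kilobytes rather than megabytes, but it is sent on
every fetch whether or not the client wants those refs.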

Those O(N) loops I referred to earlier explain why it gets ugly with
hundreds of branches: running an O(N) scan for each of N refs turns
into an O(N^2) matching algorithm.  Not pretty.  A simple hash would
solve a lot of that, bringing the first time of 0m2.372s much closer
to the second time of 0m0.942s.
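
To illustrate the difference, here is a minimal Python sketch.  It is
not git's actual C code; the function names and the sample refs are
invented for the example.  It only shows the shape of the two
approaches: a nested scan versus a single hash lookup per ref.

```python
def match_nested(wanted, advertised):
    """O(N*M): scan the whole advertised list once per wanted ref."""
    found = {}
    for name in wanted:
        for adv_name, sha in advertised:
            if adv_name == name:
                found[name] = sha
                break
    return found


def match_hashed(wanted, advertised):
    """O(N+M): build a dict once, then do one lookup per wanted ref."""
    by_name = dict(advertised)
    return {name: by_name[name] for name in wanted if name in by_name}


# Hypothetical advertisement: 500 topic branches plus master, with
# fake 40-hex-digit object ids.
advertised = [(f"refs/heads/topic-{i}", f"{i:040x}") for i in range(500)]
advertised.append(("refs/heads/master", "f" * 40))
wanted = [name for name, _ in advertised]

# Both approaches give the same answer; only the amount of work differs.
assert match_nested(wanted, advertised) == match_hashed(wanted, advertised)
```

With all 501 refs wanted, the nested version does on the order of
N^2/2 string comparisons, while the hashed version does one dict
build and N lookups.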

Neither of which can compete with your tip-fetch.

Have you tried using git-pack-refs to pack the branches on the
remote repository?

If you update all of the branches, run `git pack-refs --all --prune`,
and only then let the testing clients start fetching, it may go much
quicker.  pack-refs moves all of the individual ref files into the
single .git/packed-refs file, reducing the number of files we need
to open and read to service a single fetch client.
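
For a sense of why that helps, here is a rough Python sketch (again,
not git's implementation; `parse_packed_refs` and the sample contents
are hypothetical) of reading every ref from the text of one
packed-refs file instead of opening one file per ref.

```python
def parse_packed_refs(text):
    """Return {refname: object id} from the text of a packed-refs file."""
    refs = {}
    for line in text.splitlines():
        # '#' lines are header comments; '^' lines carry the peeled
        # object of the annotated tag on the preceding line.
        if not line or line.startswith("#") or line.startswith("^"):
            continue
        sha, name = line.split(" ", 1)
        refs[name] = sha
    return refs


# Made-up file contents with fake object ids.
sample = "\n".join([
    "# pack-refs with: peeled",
    "a" * 40 + " refs/heads/master",
    "b" * 40 + " refs/heads/topic-1",
    "c" * 40 + " refs/tags/v1.0",
    "^" + "d" * 40,
]) + "\n"

refs = parse_packed_refs(sample)
print(refs["refs/heads/master"])
```

One open and one sequential read cover all of the refs, where the
loose layout needs a directory walk plus an open and read per ref.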

I wonder if git-pack-refs + fetching only a single branch will get
you closer to the tip-fetch time.

Also, I wonder if you really need to fetch over SSH.  Doing a
fetch over git:// is much quicker, as there is no SSH session
setup overhead.

-- 
Shawn.