Ingo Molnar <mingo@xxxxxxx> wrote:
>
> Setup/background: distributed kernel testing cluster, [...]
>
> Problem: i noticed that git-fetch is a tad slow:
>
>   titan:~/tip> time git-fetch
>   real 0m2.372s
>
> There are hundreds of branches, so i thought fetching a single branch
> alone would improve things:
>
>   titan:~/tip> time git-fetch origin master
>   real 0m0.942s
>
> But that's still slow - so i use a (lame) ad-hoc script instead:
>
>   titan:~/tip> time tip-fetch
>   real 0m0.246s

OK, yes, when there are _many_ branches like that, limiting the fetch
to only the branch(es) you must have can make it go much faster.
Part of the problem is that we loop over the branches many times, and
those are O(N) loops (N = number of branches).  We could do better,
but we don't.

One reason your tip-fetch runs so much faster is that it doesn't have
to enumerate the hundreds of advertised branches offered up by the
remote peer to find the one you want to fetch.  Your tip-fetch reads
only that one ref file (.git/refs/heads/master) and that's pretty
much it.

In contrast, git-upload-pack on the server side must open and read
_all_ ref files under .git/refs/ and send them to the client, which
then has to loop over them at least twice before it can decide
whether a match exists.

That's a lot more data to shove down over SSH.  Granted, it's only
42 bytes + refname per ref, but it's still more.

Those O(N) loops I referred to earlier explain why it gets ugly with
hundreds of branches: together they turn into an O(N^2) matching
algorithm.  Not pretty.  A simple hash would solve a lot of that,
bringing the first time of 0m2.372s much closer to the second time of
0m0.942s.  Neither of those can compete with your tip-fetch.

Have you tried using git-pack-refs to pack the branches on the remote
repository?  If you update all of the branches, run
`git pack-refs --all --prune`, and only then allow the testing clients
to start fetching, it may go much quicker.
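To make the effect of packing a bit more concrete, here is a toy
Python sketch.  It is not git's code: the helper names are made up,
and the layout is only an approximation of a real repository.  It
writes the same set of refs once as one small file per ref and once
as a single file of "<sha> <refname>" lines, then counts how many
files have to be opened to read them all back.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def write_loose_refs(root: Path, refs: dict[str, str]) -> None:
    """Write one small file per ref, like an unpacked refs/ tree."""
    for name, sha in refs.items():
        path = root / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(sha + "\n")


def read_loose_refs(root: Path) -> tuple[dict[str, str], int]:
    """Read every ref file under root/refs; return (refs, files opened)."""
    refs: dict[str, str] = {}
    opened = 0
    for path in sorted((root / "refs").rglob("*")):
        if path.is_file():
            refs[path.relative_to(root).as_posix()] = path.read_text().strip()
            opened += 1
    return refs, opened


def write_packed_refs(path: Path, refs: dict[str, str]) -> None:
    """Write all refs into one file, one '<sha> <refname>' line each."""
    lines = [f"{sha} {name}\n" for name, sha in sorted(refs.items())]
    path.write_text("".join(lines))


def read_packed_refs(path: Path) -> tuple[dict[str, str], int]:
    """Parse the single packed file; return (refs, files opened)."""
    refs: dict[str, str] = {}
    for line in path.read_text().splitlines():
        # Skip blank lines, '#' header lines and '^' peeled-tag lines.
        if not line or line[0] in "#^":
            continue
        sha, name = line.split(" ", 1)
        refs[name] = sha
    return refs, 1


def demo(n: int = 300) -> tuple[bool, int, int]:
    """Build n fake branches both ways and compare the read cost."""
    # Fake 40-hex-digit object names; real ones would be commit ids.
    refs = {f"refs/heads/branch-{i:03d}": f"{i:040x}" for i in range(n)}
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        write_loose_refs(root, refs)
        write_packed_refs(root / "packed-refs", refs)
        loose, loose_opened = read_loose_refs(root)
        packed, packed_opened = read_packed_refs(root / "packed-refs")
    return loose == packed == refs, loose_opened, packed_opened
```

Calling demo(300) returns whether both layouts yield the same refs,
plus the number of files opened for each layout.  The sketch only
counts file opens; it says nothing about the network side of a fetch.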
git-pack-refs moves all of the individual ref files into the single
.git/packed-refs file, reducing the number of files we need to open
and read to service a single fetch client.  I wonder if git-pack-refs
+ fetching only a single branch will get you closer to the tip-fetch
time.

Also, I wonder if you really need to fetch over SSH.  Doing a fetch
over git:// is much quicker, as there is no SSH session setup
overhead.

--
Shawn.