On Tue, Oct 11, 2016 at 6:52 PM, Jeff King <peff@xxxxxxxx> wrote:
> On Tue, Oct 11, 2016 at 09:34:28PM -0400, Jeff King wrote:
>
>> > Ok, time to present data... Let's assume a degenerate case first:
>> > "up-to-date with all remotes" because that is easy to reproduce.
>> >
>> > I have 14 remotes currently:
>> >
>> > $ time git fetch --all
>> > real 0m18.016s
>> > user 0m2.027s
>> > sys 0m1.235s
>> >
>> > $ time git config --get-regexp remote.*.url |awk '{print $2}' |xargs
>> > -P 14 -I % git fetch %
>> > real 0m5.168s
>> > user 0m2.312s
>> > sys 0m1.167s
>>
>> So first, thank you (and Ævar) for providing real numbers. It's clear
>> that I was talking nonsense.
>>
>> Second, I wonder where all that time is going. Clearly there's an
>> end-to-end latency issue, but I'm not sure where it is. Is it startup
>> time for git-fetch? Is it in getting and processing the ref
>> advertisement from the other side? What I'm wondering is if there are
>> opportunities to speed up the serial case (but nobody really cared
>> before because it doesn't matter unless you're doing 14 of them back to
>> back).
>
> Hmm. I think it really might be just network latency. Here's my fetch
> time:
>
> $ git config remote.origin.url
> git://github.com/gitster/git.git
>
> $ time git fetch origin
> real 0m0.183s
> user 0m0.072s
> sys 0m0.008s
>
> 14 of those in a row shouldn't take more than about 2.5 seconds, which
> is still twice as fast as your parallel case. So what's going on?
>
> One is that I live about a hundred miles from GitHub's data center, and
> my ping time there is ~13ms. The other side of the country, let alone
> Europe, is going to be noticeably slower just for the TCP handshake.
>
> The second is that git:// is really cheap and simple. git-over-ssh is
> over twice as slow:
>
> $ time git fetch git@xxxxxxxxxx:gitster/git
> ...
> real 0m0.432s
> user 0m0.100s
> sys 0m0.032s
>
> HTTP fares better than I would have thought, but is also slower:
>
> $ time git fetch https://github.com/gitster/git
> ...
> real 0m0.258s
> user 0m0.080s
> sys 0m0.032s
>
> -Peff

Well, 9/14 are https for me, the rest is git://
Also 9/14 (but a different set) is github, the rest is
either internal or kernel.org.

Fetching from github (https) is only 0.9s from here
(SF bay area, I'm not in Europe any more ;) )

I would have expected a speedup of roughly 2 + latency gains.
Factor 2 because in the current state of affairs either the client or the
remote is working, i.e. the other side is idle/waiting, so factor 2 seemed
reasonable (and ofc the latency), so I was a bit surprised to see a higher
yield.
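
For reference, a rough sketch of the arithmetic behind the numbers quoted
above. The variable names are made up for illustration; the values are the
"real" times reported in this thread, nothing was re-measured.

```shell
#!/bin/sh
# Wall-clock ("real") times quoted in the thread, in seconds.
# Variable names are illustrative only.
serial=18.016     # git fetch --all, 14 remotes
parallel=5.168    # xargs -P 14 -I % git fetch %
single=0.183      # one git:// fetch as measured by Peff
remotes=14

# Observed speedup of the parallel run over the serial run.
speedup=$(awk -v s="$serial" -v p="$parallel" 'BEGIN { printf "%.2f", s / p }')

# Peff's estimate: 14 back-to-back fetches at his single-fetch time.
estimate=$(awk -v t="$single" -v n="$remotes" 'BEGIN { printf "%.2f", t * n }')

echo "observed speedup: ${speedup}x"
echo "estimated serial time at ${single}s per fetch: ${estimate}s"
```

This prints a speedup of 3.49x (versus the factor of 2 I had expected) and
an estimate of 2.56s, which matches Peff's "about 2.5 seconds".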