I was reviewing this issue and have an updated attempt that solves it slightly differently. I think I have something working but would like to borrow extra sets of eyeballs.

    From: Junio C Hamano <junkio@xxxxxxx>
    Subject: [PATCH/RFC] upload-pack: stop "ack continue" when we know common commits for wanted refs
    To: Ralf Baechle <ralf@xxxxxxxxxxxxxx>
    cc: git@xxxxxxxxxxxxxxx, Linus Torvalds <torvalds@xxxxxxxx>
    Date: Fri, 26 May 2006 19:20:54 -0700
    Message-ID: <7vfyiwi4xl.fsf@xxxxxxxxxxxxxxxxxxxxxxxx>

    When the downloader's repository has more roots than the server
    side has, the "have" exchange to figure out recent common commits
    ends up traversing the whole history of branches that only exist
    on the downloader's side. When the downloader is asking for newer
    commits on the branch that exists on both ends, this is totally
    unnecessary.

    This adds logic to the server side to see if the wanted refs can
    reach the "have" commits received so far, and stop issuing
    "ack continue" once all of them can be reached from "have" commits.

The idea in the new implementation is to notice that the downloader sent a "have" for an object we do not know about. When we already have some "have"s from them, and it is still not known whether some of the "want"s are reachable from these "have"s, we traverse the commit ancestry down to the oldest "have" so far (this is just a heuristic) to see if every "want" has some common ancestor with the other side. Once we know that every "want" can reach some "have" we have seen so far, we send "ACK continue" when the downloader sends a "have" that we do not have, to cause the downloader to stop traversing that futile branch which leads to the root we do not have.

The code sits near the tip of "pu".

I've started from a clone of the git.git repository and tried to fetch the "todo" branch from another clone that does not have anything but the "todo" branch.
So the downloader has five extra roots (one for git.git itself, one for gitk, one for gitweb, and one each for htmldocs and manpages).

    # upstream is just "todo" branch and nothing else
    git clone -n git.git upstream
    cd upstream
    mv .git/refs trash
    mkdir -p .git/refs/heads .git/refs/tags
    echo 'ref: refs/heads/master' >.git/HEAD
    cat trash/heads/todo >.git/refs/heads/master
    git repack -a -d
    cd ..

    # downloader has up-to-date git.git but stale "todo"
    git clone -n git.git downloader
    cd downloader
    git checkout todo
    git reset --hard HEAD~30
    git repack -a -d

    # try downloading things from upstream
    git fetch-pack -k -v ../upstream master 2>/var/tmp/new.out
    git fetch-pack -k -v --exec=old-git-upload-pack \
        ../upstream master 2>/var/tmp/old.out

It does send a smaller number of "have"s than the current code. But I noticed that near the end of the transfer, after it gets an "ACK continue" for a common commit on the "todo" branch and an "ACK continue" for a not-common commit on the "master" branch, it keeps sending the commits that are marked on the fetch-pack side as COMMON_REF (so the last ref sent is the v0.99^0 commit). upload-pack has already told the downloader that everything reachable from the "master" branch is common to both sides, so I suspect it should not go that far down that path to reach the v0.99^0 commit.

I have a feeling that either the get_rev() or the mark_common() logic is not properly marking the ancestors of commits that are known to be common. Does this ring a bell?
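
In case it helps reviewers picture the server-side decision described above, here is a toy model in Python. It is only a sketch and not the upload-pack code: the names `reachable_from`, `all_wants_covered` and `respond`, and the dict-of-parents history, are made up for illustration. It also walks the whole ancestry instead of stopping at the oldest "have" as the heuristic above does.

```python
from collections import deque

def reachable_from(tip, parents):
    """Return the set of commits reachable from `tip` via parent links."""
    seen = set()
    queue = deque([tip])
    while queue:
        commit = queue.popleft()
        if commit in seen:
            continue
        seen.add(commit)
        queue.extend(parents.get(commit, ()))
    return seen

def all_wants_covered(wants, known_haves, parents):
    """True if every "want" can reach at least one "have" we know about."""
    haves = set(known_haves)
    return bool(haves) and all(reachable_from(w, parents) & haves for w in wants)

def respond(have, wants, known_haves, parents):
    """Decide the reply to one "have" line (hypothetical helper).

    `parents` maps each commit the server has to its list of parents.
    Returns "ACK continue" or None (no ACK for this "have").
    """
    if have in parents:
        # A genuinely common commit: remember it and acknowledge.
        known_haves.append(have)
        return "ACK continue"
    if all_wants_covered(wants, known_haves, parents):
        # Unknown commit, but every "want" already has common history:
        # acknowledge anyway so the downloader stops digging this branch.
        return "ACK continue"
    return None

# Server only has the "todo" history t1 <- t2 <- t3; the m* commits
# live on a branch that exists only on the downloader's side.
server = {"t1": [], "t2": ["t1"], "t3": ["t2"]}
haves = []
for h in ["m3", "t2", "m2"]:
    print(h, respond(h, ["t3"], haves, server))
```

With this toy history, the first unknown "have" gets no ACK because nothing common has been seen yet. After "t2" is acknowledged, the next unknown "have" is acknowledged too, which is the point at which the downloader is meant to give up on that branch.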