On Thu, Sep 12, 2013 at 12:45:44PM +0000, Pyeron, Jason J CTR (US) wrote:

> If the rules of engagement are changed a bit, the server side can be
> released from most of its work (CPU/IO).
>
> Client does the following, looping as needed:
>
> Heads=server->heads();
> KnownCommits=Local->AllCommits();
> MissingBlobs=[];
> Foreach(commit:Heads) if (!KnownCommits->contains(commit)) MissingBlobs[]=commit;
> Foreach(commit:KnownCommits) if (!commit->isValid()) MissingBlobs[]=commit->blobs();
> If (MissingBlobs->size()>0) server->FetchBlobs(MissingBlobs);

That doesn't quite work. The client does not know the set of missing
objects just from the commits. It knows the sha1 of the root trees it is
missing. And then if it fetches those, it knows the sha1 of any
top-level entries it is missing. And when it gets those, it knows the
sha1 of any 2nd-level entries it is missing, and so forth.

You can progressively ask for each level, but:

  1. You are spending a round-trip for each request. Doing it
     per-object is awful (the dumb http walker will do this if the repo
     is not packed, and it's S-L-O-W). Doing it per-level would be
     better, but not great.

  2. You are losing opportunities for deltas (or you are making the
     state the server needs to maintain very complicated, as it must
     remember from request to request which objects you have gotten
     that can be used as delta bases).

  3. There is a lot of overhead in this protocol. The client has to
     mention each object individually by sha1. It may not seem like a
     lot, but it can easily add 10% to a clone (just look at the size
     of the pack .idx files versus the packfiles themselves).

-Peff
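
To make the per-level cost concrete, here is a minimal sketch in Python of
the progressive fetch described above. It is a toy model and not git's wire
protocol: the object ids, the SERVER_OBJECTS table, and the function
fetch_level_by_level are hypothetical names invented for illustration, and a
"request" is just a dictionary lookup. It only counts how many round trips
the client needs and how many object ids it has to name.

```python
# Toy object graph (hypothetical ids): each id maps to the ids it references.
# A commit points at its root tree; trees point at subtrees and blobs.
SERVER_OBJECTS = {
    "commit1": ["tree_root"],
    "tree_root": ["tree_src", "blob_readme"],
    "tree_src": ["tree_lib", "blob_main"],
    "tree_lib": ["blob_util"],
    "blob_readme": [],
    "blob_main": [],
    "blob_util": [],
}


def fetch_level_by_level(server, local, wanted):
    """Fetch everything reachable from `wanted`, one request per level.

    `server` and `local` map object id -> list of referenced ids.
    Returns (round_trips, ids_sent): the number of requests made and the
    total number of object ids the client had to mention by name.
    """
    round_trips = 0
    ids_sent = 0
    frontier = [oid for oid in wanted if oid not in local]
    while frontier:
        # One request: the client names every object it now knows it lacks.
        round_trips += 1
        ids_sent += len(frontier)
        discovered = []
        for oid in frontier:
            children = server[oid]
            local[oid] = children
            # Only after receiving an object does the client learn the ids
            # of the next level down.
            for child in children:
                if child not in discovered:
                    discovered.append(child)
        # Drop anything we already hold, including objects fetched this round.
        frontier = [oid for oid in discovered if oid not in local]
    return round_trips, ids_sent


if __name__ == "__main__":
    trips, ids = fetch_level_by_level(SERVER_OBJECTS, {}, ["commit1"])
    print(trips, ids)  # prints "5 7"
```

Even in this tiny graph the client needs five round trips for seven objects,
one per level of the tree, and it must name all seven ids explicitly. A raw
sha1 is 20 bytes, so that naming cost grows linearly with the object count,
which is the overhead point 3 refers to. The sketch says nothing about
deltas; modeling point 2 would require the server to track what each client
already holds.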