On Sat, 4 Sep 2010, Luke Kenneth Casson Leighton wrote:

> so, i believe that a much simpler algorithm is to follow nicolas'
> advice, and:
>
> * split up a pack-index file by its fanout (1st byte of SHAs in the idx)
> * create SHA1s of the list of object-refs within an individual fanout
> * compare the per-fanout SHA1s remote and local
> * if same, deduce "oh look, we have that per-fanout list already"
> * grab the per-fanout object-ref list using standard p2p filesharing
>
> in this way you'd end up breaking down e.g. 50mb of pack-index (for
> e.g. linux-2.6.git) into roughly 200k chunks, and you'd exchange
> roughly 50k of network traffic to find out that you'd got some of
> those fanout object-ref-lists already.  which is nice.

Scrap that idea -- this won't work.

The problem is that, by nature, SHA1 output is uniformly distributed.
So if you have, say, 256 objects to transfer (and 256 objects is not
that much) then, statistically, those objects are almost certain to be
spread across most of the 256 fanouts, touching nearly two thirds of
them on average.  Almost every per-fanout list then changes, its SHA1
no longer matches, and hardly any of the chunks can be reused.  The
algorithm I mentioned completely breaks down in that case.

Nicolas
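To put a number on that, here is a quick simulation of the quoted
scheme (a rough sketch, not code from this thread: random 20-byte
strings stand in for object SHA1s, and the 100,000-entry index is an
arbitrary stand-in for a large pack; the 256 new objects are the
example figure from above):

    import hashlib, os

    FANOUTS = 256

    def fanout_hashes(objects):
        # One bucket per possible first byte of a SHA1, i.e. per fanout
        # slot in the pack-index; return the SHA1 of each sorted
        # per-fanout object-ref list, as the quoted scheme proposes.
        buckets = [[] for _ in range(FANOUTS)]
        for sha in objects:
            buckets[sha[0]].append(sha)
        return [hashlib.sha1(b"".join(sorted(b))).digest() for b in buckets]

    existing = [os.urandom(20) for _ in range(100000)]  # stand-in index
    new = [os.urandom(20) for _ in range(256)]          # 256 new objects

    before = fanout_hashes(existing)
    after = fanout_hashes(existing + new)
    changed = sum(b != a for b, a in zip(before, after))
    print("%d of %d per-fanout lists changed" % (changed, FANOUTS))

This consistently reports around 162 of the 256 lists changed, which
matches the expected value 256 * (1 - (255/256)**256) ~= 161.9: even a
modest 256-object transfer invalidates nearly two thirds of the
per-fanout SHA1s, so the comparison step saves almost no traffic.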