On Thu, Jan 6, 2011 at 1:07 AM, Luke Kenneth Casson Leighton <luke.leighton@xxxxxxxxx> wrote:
> the plan is to turn that variation in the git pack-objects responses,
> across multiple peers, into an *advantage* not a liability. how?
> like this:
>
> * a client requiring objects from commit abcd0123 up to commit
> efga3456 sends out a DHT broadcast query to all and sundry who have
> commits abcd0123 and everything in between up to efga3456.
>
> * those clients that can be bothered to respond, do so [refinements below]
>
> * the requestor selects a few of them, and asks them to create git
> pack-objects. this takes time, but that's ok. once created, the size
> of the git pack-object is sent as part of the acknowledgement.
>
> * the requestor, on receipt of all the sizes, selects the *smallest*
> one to begin the p2p (.torrent) from (by asking the remote client to
> create a .torrent specifically for that purpose, with the filename
> abcd0123-ebga3456).

That defeats the purpose of distributing. You are putting pressure on
certain peers.

> now, an immediately obvious refinement of this is that those .torrent
> (pack-objects) "stick around", in a cache (with a hard limit defined
> on the cache size of course). and so, when the client that requires a
> pack-object makes the request, of course, those remote clients that
> *already* have that cached pack-object for that specific commit-range
> should be given first priority, to avoid other clients from having to
> make massive amounts of git pack-objects.

Caches have their limits too. Suppose I fetch half of a pack, then stop
and go wild for a month. When I restart the fetch the next month, the
pack may no longer be in the cache. A new pack may or may not be
identical to the old pack.

Also, if you go with packs, you are tied to the peer that generates that
pack. Two different peers can, in theory, generate different packs (in
encoding) for the same input.
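To illustrate the last point, here is a minimal Python sketch. It assumes only that pack contents are zlib-deflated and that two peers might use different compression levels. The names peer_a and peer_b are hypothetical, and real packs can also differ in delta choices and object order, which this does not model. The object id stays the same while the bytes on the wire, and therefore any piece hash computed over them, differ.

```python
import hashlib
import zlib


def git_blob_id(content: bytes) -> str:
    """Return the git object id of a blob: SHA-1 over 'blob <size>\\0' + content."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The same object content, as both peers have it in their repositories.
content = b"hello, p2p git\n" * 200

# Hypothetical peers deflating the same content with different settings.
peer_a = zlib.compress(content, 1)
peer_b = zlib.compress(content, 9)

# After inflating, both peers' data names the same git object...
same_object = git_blob_id(zlib.decompress(peer_a)) == git_blob_id(zlib.decompress(peer_b))

# ...but the encoded bytes differ, so hashes over the encoded stream differ too.
same_bytes = peer_a == peer_b
piece_hash_a = hashlib.sha1(peer_a).hexdigest()
piece_hash_b = hashlib.sha1(peer_b).hexdigest()

print("same object id:", same_object)
print("same encoded bytes:", same_bytes)
```

So a torrent built over one peer's encoding cannot be seeded by a peer that produced the other encoding, even though both hold identical objects.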

Another thing with packs (ok, not exactly with packs) is how you verify
that you have got what you asked for. BitTorrent can verify every piece a
peer receives because the SHA-1 sums of those pieces are recorded in the
.torrent file. We have SHA-1 all over the place, but if you don't have
the base objects to undeltify against, you can't use those SHA-1s to
verify. Verification is an important step before you advertise to other
peers "I have these".

> so, can you see that a) this is a far cry from the "simplistic
> transfer of blobs and trees" b) it's *not* going to overload peoples'
> systems by splattering (eek!) millions of md5 sums across the internet
> as bittorrent files c) it _does_ fit neatly into the bittorrent
> protocol d) it combines the best of git with the best of p2p
> distributed networking principles...

How can you advertise what you have to another peer?
--
Duy
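
For reference, the per-piece check discussed in this message can be sketched in Python as below. This is only an illustration under stated assumptions: the piece size is tiny for readability, pack_data is stand-in bytes rather than a real pack, and the helper names piece_hashes and verify_piece are hypothetical. The point is that each piece can be checked on its own against a hash list published up front, whereas a git object id covers the full undeltified content and so cannot be checked from a delta alone.

```python
import hashlib
from typing import List

PIECE_SIZE = 16  # tiny, for illustration only; real torrents use far larger pieces


def piece_hashes(data: bytes, piece_size: int = PIECE_SIZE) -> List[bytes]:
    """Split data into fixed-size pieces and return the SHA-1 digest of each."""
    return [
        hashlib.sha1(data[i:i + piece_size]).digest()
        for i in range(0, len(data), piece_size)
    ]


def verify_piece(index: int, piece: bytes, hashes: List[bytes]) -> bool:
    """Check one received piece against the published hash list."""
    return 0 <= index < len(hashes) and hashlib.sha1(piece).digest() == hashes[index]


# Stand-in for the bytes of a pack being shared (not a real pack file).
pack_data = bytes(range(256))
hashes = piece_hashes(pack_data)

# A piece received intact verifies without needing any other piece.
good_piece = pack_data[16:32]
print(verify_piece(1, good_piece, hashes))

# A corrupted piece is rejected before it is offered to other peers.
bad_piece = b"\xff" + good_piece[1:]
print(verify_piece(1, bad_piece, hashes))
```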