On Sun, Jan 9, 2011 at 3:34 AM, Nguyen Thai Ngoc Duy <pclouds@xxxxxxxxx> wrote:
> On Sun, Jan 9, 2011 at 12:21 AM, Luke Kenneth Casson Leighton
> <luke.leighton@xxxxxxxxx> wrote:
>>  ok - you haven't answered the question: are the chains perfectly
>> fixed identical sizes?
>
> No.
>
>>  if so they can be slotted into the bittorrent protocol by simply
>> pre-selecting the size to match.  with the downside that if there are
>> a million such "chains" you now pretty much overwhelm the peers with
>> the amount of processing, network traffic and memory requirements to
>> maintain the "pieces" map.
>
> No, there are thousands of them only (less than 100k for repos I
> examined). It's precisely the reason I stay away from commits as
> pieces because commits can potentially go up to millions.

 ok - thousands is still a lot.  i recommend that you examine:

 * the heuristics algorithm in bittorrent for piece-selection
 * large repositories such as webkit (1.2gb) and the linux kernel (600mb)

 you still have to come up with a mapping from "chains" to "pieces".
 in the bittorrent protocol the mapping is done *entirely* implicitly
 and algorithmically.  the "meta" info in the .torrent contains
 filenames and file lengths.  stack the files one after the other in a
 big long data block, get a chopper and just go "whack, whack, whack"
 at regular piece-long points, that's your "pieces".

 so, reassembly is a complete bitch, and picking just _one_ file to
 download rather than the whole lot becomes a total pain.  why the
 bloody hell the bittorrent protocol doesn't just have a file id i
 _really_ don't know, it would have made things a damn sight easier.

 anyway - if you're going to modify and "be inspired by" the bittorrent
 protocol, you really should look at adding some sort of "chain"
 identification - f*** the "chains"-to-"pieces" algorithm, just add a
 unique chain id to the relevant bittorrent[-like] command.

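 to make the "stack the files and chop" mapping concrete, here is a
 minimal python sketch.  it is an illustration only, not code from any
 bittorrent client: the names FileEntry, piece_count and piece_spans
 are made up for this example, and it assumes nothing more than a list
 of (path, length) pairs plus a fixed piece length, as found in the
 .torrent "meta" info.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class FileEntry:
    """Hypothetical stand-in for one (path, length) entry of the metainfo."""
    path: str
    length: int  # size in bytes


def piece_count(files: List[FileEntry], piece_length: int) -> int:
    """Number of pieces once all files are stacked end to end.

    Only the last piece may be shorter than piece_length.
    """
    total = sum(f.length for f in files)
    return (total + piece_length - 1) // piece_length


def piece_spans(files: List[FileEntry], piece_length: int) -> Dict[str, Tuple[int, int]]:
    """Map each non-empty file to the (first, last) piece index it touches.

    There is no file id anywhere: a file's pieces are derived purely from
    its byte offset inside the concatenated data block.
    """
    spans: Dict[str, Tuple[int, int]] = {}
    offset = 0  # byte offset of the current file within the big block
    for f in files:
        if f.length > 0:
            first = offset // piece_length
            last = (offset + f.length - 1) // piece_length
            spans[f.path] = (first, last)
        offset += f.length
    return spans


# Example: three files chopped at 128-byte boundaries.
example_files = [FileEntry("a", 100), FileEntry("b", 250), FileEntry("c", 50)]
example_spans = piece_spans(example_files, 128)
# example_spans == {"a": (0, 0), "b": (0, 2), "c": (2, 3)}
```

 note that piece 0 holds bytes of both "a" and "b", and piece 2 holds
 bytes of both "b" and "c".  fetching only "c" therefore means fetching
 pieces 2 and 3 and throwing part of piece 2 away, which is the "total
 pain" described above and the reason an explicit chain id is simpler.
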
>>  if not then you now need to modify the bittorrent protocol to cope
>> with variable-length block sizes: the protocol only allows for the
>> last block to be of variable-length.
>
> Ah I see. I do not reuse bittorrent code out there. Just its ideas,
> adapted to git model.

 that's hard work and you're now into "unproven" territory.  in the
 successful R&D proof-of-concept code that i wrote, i _deliberately_
 stayed away from "adapting" a proven bittorrent protocol, and as a
 result managed to get that proof-of-concept up and running within ...
 i think it was... 3 days.  most of the time was spent arseing about
 adding a VFS layer into bittornado, in order to libratise it.

 i mention that just to give you something to think about.  if you're
 up to the challenge of writing your own p2p protocol, however, GREAT!
 you'll become a world expert on _both_ peer-to-peer protocols _and_
 git :)

 l.