Re: Resumable clone/Gittorrent (again)

On Sun, Jan 9, 2011 at 3:34 AM, Nguyen Thai Ngoc Duy <pclouds@xxxxxxxxx> wrote:
> On Sun, Jan 9, 2011 at 12:21 AM, Luke Kenneth Casson Leighton
> <luke.leighton@xxxxxxxxx> wrote:
>>  ok - you haven't answered the question: are the chains perfectly
>> fixed identical sizes?
>
> No.
>
>>  if so they can be slotted into the bittorrent protocol by simply
>> pre-selecting the size to match.  with the downside that if there are
>> a million such "chains" you now pretty much overwhelm the peers with
>> the amount of processing, network traffic and memory requirements to
>> maintain the "pieces" map.
>
> No, there are thousands of them only (less than 100k for repos I
> examined). It's precisely the reason I stay away from commits as
> pieces because commits can potentially go up to millions.

 ok - thousands is still a lot.  i recommend that you examine:

 * the heuristics algorithm in bittorrent for piece-selection
 * large repositories such as webkit (1.2gb) and the linux kernel (600mb)

 you still have to come up with a mapping from "chains" to "pieces".
in the bittorrent protocol the mapping is done *entirely* implicitly
and algorithmically.  the "meta" info in the .torrent contains
filenames and file lengths.  stack the files one after the other in a
big long data block, get a chopper and just go "whack, whack, whack"
at regular piece-length intervals, that's your "pieces".  so, reassembly is
a complete bitch, and picking just _one_ file to download rather than
the whole lot becomes a total pain.
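
A minimal sketch of that implicit mapping, assuming only what the metainfo gives you: an ordered list of (name, length) pairs plus a piece length. The function names and the sample file list are made up for illustration; this is not code from any bittorrent client.

```python
from bisect import bisect_right
from itertools import accumulate


def piece_to_file_spans(files, piece_length, piece_index):
    """Return [(name, offset_in_file, nbytes)] for the files one piece covers.

    files is an ordered list of (name, length) pairs, treated as one long
    concatenated byte stream that is chopped every piece_length bytes.
    """
    # starts[i] is the offset of file i in the concatenated stream;
    # the final entry is the total length.
    starts = [0] + list(accumulate(length for _, length in files))
    total = starts[-1]
    begin = piece_index * piece_length
    if piece_index < 0 or begin >= total:
        raise IndexError("piece index out of range")
    end = min(begin + piece_length, total)  # only the last piece may be short

    spans = []
    i = bisect_right(starts, begin) - 1  # file containing the first byte
    while begin < end:
        name, length = files[i]
        take = min(end, starts[i] + length) - begin
        if take > 0:  # skips zero-length files
            spans.append((name, begin - starts[i], take))
            begin += take
        i += 1
    return spans


def pieces_for_file(files, piece_length, file_index):
    """Return the range of piece indices needed to fetch one whole file."""
    start = sum(length for _, length in files[:file_index])
    length = files[file_index][1]
    if length == 0:
        return range(0)
    return range(start // piece_length, (start + length - 1) // piece_length + 1)


if __name__ == "__main__":
    demo = [("a", 10), ("b", 25), ("c", 5)]  # hypothetical file list
    for idx in range(3):
        print(idx, piece_to_file_spans(demo, 16, idx))
    print(list(pieces_for_file(demo, 16, 1)))
```

With the sample list, fetching only file "b" needs all three pieces, two of which also carry bytes of "a" and "c". That is the "total pain" in miniature: piece boundaries ignore file boundaries.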

why the bloody hell the bittorrent protocol doesn't just have a file
id i _really_ don't know, it would have made things a damn sight
easier.  anyway - if you're going to modify and "be inspired by" the
bittorrent protocol, you really should look at adding some sort of
"chain" identification - f*** the "chains"-to-"pieces" algorithm, just
add a unique chain id to the relevant bittorrent[-like] command.
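
To make the idea concrete, here is a rough sketch comparing the standard bittorrent "request" message (4-byte length prefix, message id 6, then piece index, begin and length as big-endian 32-bit integers) with a chain-addressed variant. Everything about the variant is an assumption made for illustration: the message id 0x80, the 20-byte chain id, and the function names are hypothetical and not part of any existing protocol.

```python
import struct

BT_REQUEST = 6        # standard bittorrent "request" message id
CHAIN_REQUEST = 0x80  # hypothetical id for a chain-addressed request
CHAIN_ID_LEN = 20     # assumption: chains are named by a 20-byte id


def pack_bt_request(index, begin, length):
    """Standard request: the peer must derive the data from a piece index."""
    return struct.pack(">IBIII", 13, BT_REQUEST, index, begin, length)


def pack_chain_request(chain_id, begin, length):
    """Hypothetical request: name the chain directly, no piece arithmetic."""
    if len(chain_id) != CHAIN_ID_LEN:
        raise ValueError("chain id must be %d bytes" % CHAIN_ID_LEN)
    payload = struct.pack(">B20sII", CHAIN_REQUEST, chain_id, begin, length)
    return struct.pack(">I", len(payload)) + payload


def unpack_chain_request(message):
    """Inverse of pack_chain_request; returns (chain_id, begin, length)."""
    (size,) = struct.unpack_from(">I", message, 0)
    if size != len(message) - 4:
        raise ValueError("length prefix does not match message size")
    msg_id, chain_id, begin, length = struct.unpack_from(">B20sII", message, 4)
    if msg_id != CHAIN_REQUEST:
        raise ValueError("not a chain request")
    return chain_id, begin, length
```

Because the request names the chain itself, chains of different lengths need no chains-to-pieces table at all; the begin/length pair is only there so a partially transferred chain can be resumed.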


>>  if not then you now need to modify the bittorrent protocol to cope
>> with variable-length block sizes: the protocol only allows for the
>> last block to be of variable-length.
>
> Ah I see. I do not reuse bittorrent code out there. Just its ideas,
> adapted to git model.

 that's hard work and you're now into "unproven" territory.  in the
successful R&D proof-of-concept code that i wrote, i _deliberately_
stayed away from "adapting" the proven bittorrent protocol, and as a
result managed to get that proof-of-concept up and running within ...
i think it was... 3 days.  most of the time was spent arseing about
adding a VFS layer to bittornado, in order to turn it into a library.

i mention that just to give you something to think about.  if you're
up to the challenge of writing your own p2p protocol, however, GREAT!
you'll become a world expert on _both_ peer-to-peer protocols _and_
git :)

 l.