Re: Resumable clone/Gittorrent (again)

On Thu, Jan 6, 2011 at 1:07 AM, Luke Kenneth Casson Leighton
<luke.leighton@xxxxxxxxx> wrote:
> the plan is to turn that variation in the git pack-objects responses,
> across multiple peers, into an *advantage* not a liability. how?
> like this:
>
> * a client requiring objects from commit abcd0123 up to commit
> efga3456 sends out a DHT broadcast query to all and sundry who have
> commits abcd0123 and everything in between up to efga3456.
>
> * those clients that can be bothered to respond, do so [refinements below]
>
> * the requestor selects a few of them, and asks them to create git
> pack-objects. this takes time, but that's ok. once created, the size
> of the git pack-object is sent as part of the acknowledgement.
>
> * the requestor, on receipt of all the sizes, selects the *smallest*
> one to begin the p2p (.torrent) from (by asking the remote client to
> create a .torrent specifically for that purpose, with the filename
> abcd0123-ebga3456).

That defeats the purpose of distributing. You are putting pressure on
certain peers.

> now, an immediately obvious refinement of this is that those .torrent
> (pack-objects) "stick around", in a cache (with a hard limit defined
> on the cache size of course). and so, when the client that requires a
> pack-object makes the request, of course, those remote clients that
> *already* have that cached pack-object for that specific commit-range
> should be given first priority, to avoid other clients from having to
> make massive amounts of git pack-objects.

Caches have their limits too. Suppose I half-fetch a pack, then stop and
go wild for a month. The next month I restart the fetch, and the pack may
no longer be in the cache. A new pack may or may not be identical to the
old pack.

Also if you go with packs, you are tied to the peer that generates
that pack. Two different peers can, in theory, generate different
packs (in encoding) for the same input.
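
To make the encoding point concrete, here is a minimal Python sketch. It
is only an analogy: two zlib compression levels stand in for two peers'
packing choices (real packs also differ in delta selection and object
order), and the helper name git_blob_id is made up for illustration. The
object ID is computed the way git hashes a blob, over a "blob <size>\0"
header followed by the content.

```python
import hashlib
import zlib


def git_blob_id(content: bytes) -> str:
    """Return the git object ID of a blob: SHA-1 over header + content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The same logical object, as two hypothetical peers might encode it.
content = b"hello git\n" * 100
enc_a = zlib.compress(content, 1)  # stand-in for peer A's encoding
enc_b = zlib.compress(content, 9)  # stand-in for peer B's encoding

# Decoding either stream gives the same object, hence the same object ID.
same_object = git_blob_id(zlib.decompress(enc_a)) == git_blob_id(zlib.decompress(enc_b))

# The bytes on the wire differ, so a hash over the encoded stream
# (which is what a .torrent piece hash covers) differs too.
same_bytes = hashlib.sha1(enc_a).digest() == hashlib.sha1(enc_b).digest()

print("same object:", same_object)
print("same encoded bytes:", same_bytes)
```

So piece hashes taken from one peer's pack say nothing about a pack
another peer generates for the same commit range.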

Another thing with packs (ok, not exactly with packs) is how you
verify that you have got what you asked for. BitTorrent can verify every
piece a peer receives because the SHA-1 sums of those pieces are recorded
in the .torrent file. We have SHA-1 all over the place, but if you don't
have base objects to undeltify, you can't use those SHA-1 to verify.
Verification is an important step before you advertise to other peers
"I have these".
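
For comparison, a rough sketch of the BitTorrent-style check described
above, in Python. The names piece_hashes and verify_piece and the tiny
piece length are assumptions for illustration; a real client reads the
concatenated 20-byte SHA-1 digests from the .torrent metadata instead of
computing them locally.

```python
import hashlib

PIECE_LEN = 16  # illustrative only; real torrents use far larger pieces


def piece_hashes(data: bytes, piece_len: int = PIECE_LEN) -> list:
    """What the .torrent creator records: one SHA-1 digest per piece."""
    return [
        hashlib.sha1(data[i:i + piece_len]).digest()
        for i in range(0, len(data), piece_len)
    ]


def verify_piece(index: int, piece: bytes, expected: list) -> bool:
    """Check one received piece on its own, with no other data needed."""
    return hashlib.sha1(piece).digest() == expected[index]


payload = bytes(range(40))      # stand-in for a pack file
expected = piece_hashes(payload)

good = payload[16:32]           # piece 1 as sent
bad = b"\xff" + good[1:]        # piece 1 with its first byte corrupted
print(verify_piece(1, good, expected))
print(verify_piece(1, bad, expected))
```

Each piece is checked in isolation. A deltified object inside a pack has
no such property: its SHA-1 can only be recomputed once the base object
is available.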

> so, can you see that a) this is a far cry from the "simplistic
> transfer of blobs and trees" b) it's *not* going to overload peoples'
> systems by splattering (eek!) millions of md5 sums across the internet
> as bittorrent files c) it _does_ fit neatly into the bittorrent
> protocol d) it combines the best of git with the best of p2p
> distributed networking principles...

How can you advertise what you have to another peer?
-- 
Duy

