Re: http://tech.slashdot.org/comments.pl?sid=1885890&cid=34358134


On Mon, 2010-11-29 at 11:27 -0800, J.H. wrote:
> > I want my version control software to use p2p concepts for efficiency. I
> > don't want my version control software to be a p2p client any more than
> > I want my text-editor to be a mail client.
> 
> Keep in mind that adding p2p concepts to something doesn't make it more
> efficient, in fact in most cases it makes it dramatically *LESS* efficient.

As a simple use case:
Everyone comes into the office in the morning and runs "git remote
update". That potentially causes a lot of traffic between the office
and its offsite central repository. In a p2p scenario, the transfer
from the offsite server could happen only once, with everyone else
fetching from peers on the LAN.

That counts as "more efficient" to me.
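
For concreteness, here is a minimal sketch of that scenario using
nothing but stock git; the host name, path and remote name are made up
for illustration. One machine pays for the offsite transfer once, then
serves the result on the LAN with git daemon, and everyone else fetches
from that peer:

  # On alice-desktop: fetch once from the offsite server, then serve
  # the clone read-only over the LAN (git:// protocol, port 9418).
  git fetch origin
  git daemon --base-path=/home/alice --export-all --reuseaddr --detach

  # On every other machine in the office: add the LAN peer and fetch
  # from it instead of (or before) hitting the offsite server.
  git remote add lanpeer git://alice-desktop/project
  git remote update lanpeer

Nobody would call that convenient (everyone has to know about
alice-desktop), but the bytes cross the WAN once instead of once per
developer.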

> 
> git-torrent like concepts have come up in the past, and I keep pointing
> out how and why they likely won't be useful.  The biggest reason: there
> is no advantage for a client to stay in the cloud once they have their
> data.  

This is true of bittorrent as well: people stay in the cloud for
altruistic reasons.

You're thinking of p2p as "every peer serves data, and keeps serving
data; the network gets more robust over time". I'm thinking of p2p as
"every peer has the ability to serve data; adding a server is as
trivial as adding a client."

The greatest advantage I can think of is "no existing server needs to
agree to the addition of a new server", or at least "the addition of a
new server is accepted by convention, no questions asked".
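
Stock git already gets most of the way there, since any clone can be
fetched from directly; the missing piece is really just discovery and
convention. A sketch, assuming plain ssh access between workstations
(user, host and path are invented):

  # Any working clone is already a valid fetch source; bob needs no
  # bare repository and no special setup beyond reachable ssh.
  git remote add bob bob@bob-desktop:src/project
  git fetch bob

  # From now on, "git remote update" consults bob as well as origin.

That is roughly what "adding a server is as trivial as adding a
client" looks like today; what git lacks is any way of doing it
automatically.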

> ..... You can force this, sure, but clones are seldom and rare to begin
> with (as you mentioned) so there won't be a very large cloud to pull
> from to start with.  With that in mind, I really see no advantage to p2p
> inside of the git core at all.  It adds a lot of complexity for little gain.

I agree that git itself is not a good place to explore p2p concepts; I
assume it would be much more useful to develop an independent p2p layer
and let git somehow use that.
And while there won't be "a large cloud", it's almost guaranteed that
there would be more than the /one/ server currently available during
clones or long fetches.

> Now you do mention things that would be useful:
> 
> - Ability to resume a clone that you only have a partial download for
> (maybe just pack files?)
> - Ability to include something like a 'meta-link' like list of
> repositories to check for data (inferred from the multiple download
> locations)
> 
> There are things we can learn from p2p, but I don't think adding it to
> git is actually useful.
> 
> Just my $0.02 though.
> 
> - John 'Warthog9' Hawley
> 

The biggest hurdle, I assume, would be "get git to talk to more than
one source of data at once", even if one has to set those sources up
manually. If I understand correctly, packfiles are generated in a way
that would not necessarily be consistent between clones, so peers could
not simply serve byte ranges of one shared file the way bittorrent
does.
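
The closest thing stock git offers is object-level borrowing rather
than byte-level swarming. A sketch, assuming a coworker's clone is
reachable at a made-up local path (say an NFS mount):

  # Borrow objects from a clone that is already nearby; only objects
  # missing from it are fetched from the offsite server.
  git clone --reference /net/alice/project \
      git://offsite.example.org/project.git project

  # The new repository keeps pointing at the borrowed objects via
  # .git/objects/info/alternates, so that clone has to stay available
  # unless its objects are later copied into this repository.

That still leaves the discovery problem (you have to know about
/net/alice/project in advance), which is exactly the part a p2p layer
on top of git could take over.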


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

