Re: [RFC PATCH] Introduce git-hive

On Tue, Aug 31, 2010 at 05:47:42PM +0100, Luke Kenneth Casson Leighton wrote:
> On Tue, Aug 31, 2010 at 4:52 PM, Casey Dahlin <cdahlin@xxxxxxxxxx> wrote:
> > Bittorrent has the luxury of being able to proxy for the poor firewall-bound
> > users since as long as there's one peer exposed to the internet you can have
> > any two other peers connect to him and give him the data they want to exchange,
> > to the benefit of all 3. Git won't work that way because not everyone in the
> > swarm wants all chunks of data, so if you found a proxy node, you might have to
> > make him carry data (possibly lots of data) that he has no personal interest
> > in.
> 
>  ah, no you don't.
> 
>  but - think about it: even if they don't, if they don't want the set
> of commits that get you up to a particular HEAD or other tag or
> branch, what the hell are they doing?? :)  from what i can gather, git
> simply doesn't operate in a way where you can "pick and choose" which
> commits you are and are not going to keep around, in order to
> reconstruct the repository.
> 
>  i hope that's right, because i'm counting on it.  i.e i'm counting on
> the following being true:
> 
>  "all copies of all git repositories have exactly the same objects
> such that git pack-object on the exact same ref and the exact same
> object ref will return exactly the same information".
> 
>  if anyone knows a reason why that is NOT the case, please could you tell me!
> 

Commits are always the same for everyone (though two commits you might think of
as "the same commit" may not be the same in git terms). Branches are pretty much
always subsets or supersets of one another. Repositories? Essentially snowflakes.

Try the kernel: Linus has a branch. Most individuals' repos are going to have
some subset of that branch checked out under various names. Not everyone will
have all of the commits though, nor will they necessarily want them just now.
Most people will have no ref pointing to the top commit of the Linus branch; it
will be a couple of commits down due to new things they've added on top. Refs
are NOT resource identifiers, and certainly not global resource identifiers.

Now consider the networking people. They have their own branch. It attaches to
the Linus branch somewhere below the tip Linus has published (probably at a
revision tag) and includes several unique commits. All the networking people
want those commits. Nobody else does. There's a btrfs branch too. Some people
want that, some don't. Some will have both the networking and btrfs branches.
Most will have added commits on top privately.

Now consider sharing by email, where the same change may end up as several
slightly different commits with different SHA-1s.
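
To make that concrete, here is a rough Python sketch of how an object ID is
derived: SHA-1 over a "<type> <size>\0" header plus the object body. The names,
email addresses, timestamps and the helper functions are made up for
illustration, and the tree is just git's well-known empty tree. The point is
only that the committer line is part of the hashed body, so the same patch
applied by two different people gives two different commits.

```python
import hashlib

# Git's well-known ID for the empty tree, used here as a stand-in tree.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"

def git_object_id(obj_type, body):
    """Hash an object the way git does: SHA-1 of '<type> <size>\\0' + body."""
    header = f"{obj_type} {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()

def commit_body(tree, author, committer, message, parents=()):
    """Build the raw text of a commit object (hypothetical helper)."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}"]
    return ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")

# Same author, same tree, same message: one patch sent to the list.
author = "A Hacker <hacker@example.com> 1283270000 +0100"
message = "Fix frobnication"

# Two people apply it, so the committer identity and time differ.
by_alice = commit_body(EMPTY_TREE, author,
                       "Alice <alice@example.com> 1283280000 +0000", message)
by_bob = commit_body(EMPTY_TREE, author,
                     "Bob <bob@example.com> 1283290000 -0400", message)

id_alice = git_object_id("commit", by_alice)
id_bob = git_object_id("commit", by_bob)

print(id_alice)
print(id_bob)
print("same commit in git terms?", id_alice == id_bob)
```

Anything built on top of either commit inherits the difference through its
parent line, so the two histories never converge on the same IDs again.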

In summary, no. Repositories are balls of independently produced content. A
good peer-to-peer network needs to let any one client individually address
everything out there, but still take advantage of those instances where lots of
people do have copies of the same object.
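
As a sketch of what I mean (not a proposal for the actual protocol), treat each
peer as an arbitrary set of object IDs and index by object rather than by
repository or ref. The peer names, the short object IDs and both functions
below are invented for the example.

```python
from collections import defaultdict

def build_index(peers):
    """Map each object ID to the set of peers that hold it."""
    index = defaultdict(set)
    for peer, objects in peers.items():
        for oid in objects:
            index[oid].add(peer)
    return index

def plan_fetch(wanted, have, index):
    """For every object we lack, list the peers that could supply it."""
    plan, unavailable = {}, set()
    for oid in wanted - have:
        holders = index.get(oid)  # .get avoids creating empty entries
        if holders:
            plan[oid] = sorted(holders)
        else:
            unavailable.add(oid)
    return plan, unavailable

# Every peer holds a different mix: shared mainline plus private extras.
peers = {
    "linus":  {"c1", "c2", "c3"},
    "netdev": {"c1", "c2", "n1", "n2"},
    "btrfs":  {"c1", "b1"},
}
index = build_index(peers)

# A client that already has c1 and wants mainline up to c2 plus n1.
plan, unavailable = plan_fetch({"c1", "c2", "n1", "x9"}, {"c1"}, index)
print(plan)         # c2 is widely held, n1 only by the networking peer
print(unavailable)  # x9 is held by nobody we know of
```

Objects that many peers share (c2 here) can be pulled from whoever is closest,
while the rare ones (n1) are still addressable, and nobody is asked to carry
data they never wanted.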

--CJD