Re: Resumable clone/Gittorrent (again) - stable packs?


On 08/01/11 07:52, Ilari Liusvaara wrote:
> Ability to contact multiple servers in sequence, each time advertising
> everything obtained so far. Then treat the new repo as clone of the last
> address.
>
> This would e.g. be very handy if you happen to have local mirror of say, Linux
> kernel and want to fetch some related project without messing with alternates
> or downloading everything again:
>
> git clone --use-mirror=~/repositories/linux-2.6 git://foo.example/linux-foo
>
> This would first fetch everything from local source and then update that
> from remote, likely being vastly faster.
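The quoted flow can be approximated today with plain git commands. A minimal sketch, using throwaway local repositories as stand-ins for the local kernel mirror and the remote (all paths and commit messages here are illustrative, not part of any proposed option):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Stand-ins: "mirror" plays ~/repositories/linux-2.6,
# "upstream" plays git://foo.example/linux-foo.
git init -q mirror
git -C mirror -c user.email=a@b -c user.name=a commit -q --allow-empty -m base
git clone -q mirror upstream
git -C upstream -c user.email=a@b -c user.name=a commit -q --allow-empty -m extra

# The proposed flow: fetch everything from the local source first...
git clone -q mirror project
cd project
# ...then repoint at the real remote and fetch only what is missing.
git remote set-url origin "$tmp/upstream"
git fetch -q origin
commits=$(git rev-list --count FETCH_HEAD)
echo "commits after mirror-then-fetch: $commits"
```

The second fetch only transfers the objects the mirror lacked, which is where the speedup comes from.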

Coming to this discussion a little late, I'll summarise the previous
research.

First, the idea of applying the straight BitTorrent protocol to the pack
files was raised, but as Nicolas mentions, this is not useful because
pack files are not deterministic: the same set of objects can be packed
into byte-for-byte different files, so there is no stable payload for a
swarm to share.  The protocol was then reworked around the part which is
stable, the object manifests.  The RFC is at
http://utsl.gen.nz/gittorrent/rfc.html and the prototype code (an
unsuccessful GSoC project) is at http://repo.or.cz/w/VCS-Git-Torrent.git

After some thought, I decided that the BitTorrent protocol itself is all
cruft and that trying to cut it down to be useful was a waste of time. 
So, this is where the idea of "automatic mirroring" came from.  With
Automatic Mirroring, the two main functions of P2P operation - peer
discovery and partial transfer - are broken into discrete features.

So far I have written this patch series, for "client-side mirroring":

http://thread.gmane.org/gmane.comp.version-control.git/133626/focus=133628

The later levels are roughly discussed on this page:

http://code.google.com/p/gittorrent/wiki/MirrorSync

The "mirror sync" part is the complicated one, and as others have noted
no truly successful prototype has yet been built.  Actually the Perl
gittorrent implementation did manage to perform an incremental clone; it
just didn't wrap it up nicely.  But I won't go into that too much. 
There was also another GSoC program to look at caching the object list
generation, the most expensive part of the process in the Perl
implementation.  This was a generic mechanism for accelerating object
graph traversal and showed promise, however unfortunately was never merged.
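For concreteness, the object list generation in question is essentially a full reachability walk. A tiny self-contained sketch (the demo repository and commit are made up just so the command has something to enumerate):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git -c user.email=a@b -c user.name=a commit -q --allow-empty -m one

# Enumerating every reachable object (commits, trees, blobs) is the
# costly step that the caching GSoC project aimed to accelerate.
objects=$(git rev-list --objects --all | wc -l | tr -d ' ')
echo "reachable objects: $objects"
```

On a repository the size of the kernel this walk touches millions of objects, which is why caching its result looked attractive.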

The client-side mirroring patch, in its current form, already supports
out-of-date mirrors.  It saves the mirror's refs into
'refs/mirrors/hostname/...' first, and only then contacts the main
server to find out which objects it is still missing.  So, if a regular
bittorrent+bundle transport were available, it would be a useful way to
support an incremental clone: the client would first clone the (static)
bittorrent bundle and unpack it, with its refs, into the
'refs/mirrors/xxx/' namespace, making the subsequent 'git fetch' of the
most recent objects a much more efficient operation.
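That bundle-first flow can be sketched with stock git, using a local bundle file in place of one fetched over bittorrent (repository names, the 'snapshot' namespace component, and commit messages are all illustrative, not what the patch actually uses):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# "main" plays the live server; the bundle plays the static file a
# bittorrent swarm would distribute.
git init -q main
git -C main -c user.email=a@b -c user.name=a commit -q --allow-empty -m v1
git -C main bundle create ../snapshot.bundle --all
git -C main -c user.email=a@b -c user.name=a commit -q --allow-empty -m v2

git init -q clone
cd clone
# Unpack the static bundle, keeping its refs in a mirrors namespace:
git fetch -q ../snapshot.bundle '+refs/heads/*:refs/mirrors/snapshot/*'
# The subsequent fetch from the main server only moves the new objects:
git remote add origin ../main
git fetch -q origin
commits=$(git rev-list --count FETCH_HEAD)
echo "commits after bundle-then-fetch: $commits"
```

Because the bundle's objects are already in the object database when the real fetch runs, the server only has to send what happened after the snapshot was cut.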

Hope that helps!

Cheers,
Sam

