Re: Continue git clone after interruption

On Wed, Aug 19, 2009, Sitaram Chamarty wrote:
> On Wed, Aug 19, 2009 at 12:15 AM, Jakub Narebski<jnareb@xxxxxxxxx> wrote:

> > There is another way which we can go to implement resumable clone.
> > Let git first try to clone the whole repository (single pack; BTW what
> > happens if this pack is larger than the file size limit for the given
> > filesystem?).  If it fails, the client asks first for the first half of
> > the repository (half as in bisect, but it is the server that has to
> > calculate it).  If that downloads, it will ask the server for the rest
> > of the repository.  If it fails, it would reduce the size by half again,
> > and ask for 1/4 of the repository in a packfile first.
> 
> How about an extension where the user can *ask* for a clone of a
> particular HEAD to be sent to him as a git bundle?  Or particular
> revisions (say once a week) were kept as a single file git-bundle,
> made available over HTTP -- easily restartable with byte-range -- and
> anyone who has bandwidth problems first gets that, then changes the
> origin remote URL and does a "pull" to get up to date?
> 
> I've done this manually a few times when sneakernet bandwidth was
> better than the normal kind, heh, but it seems to me the lowest impact
> solution.
> 
> Yes you'd need some extra space on the server, but you keep only one
> bundle, and maybe replace it every week by cron.  Should work fine
> right now, as is, with a wee bit of manual work by the user, and a
> quick cron entry on the server

This is a good idea, I think, and it can be implemented with varying
amounts of effort and changes to git, and varying degrees of seamless
integration.

1. Simplest solution: social (homepage).  Not integrated at all.

   On the project's homepage, the page that describes where the project
   repository is and how to get it, you add a link to the most recent
   bundle (perhaps in addition to the most recent snapshot).  This bundle
   would be served as a static file via HTTP (and perhaps also FTP) by
   any web server that supports resuming (range requests).  Or you can
   make the server generate bundles on demand, only when they are first
   requested.

   Most recent might mean latest tagged release, or it might mean daily
   snapshot^W bundle.

   This solution could be integrated into gitweb, either by a generic
   'latest bundle' link in the project's README.html (or in the site's
   GITWEB_HOMETEXT, default indextext.html), or by having gitweb
   generate those links (and perhaps the bundles as well) by itself.
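
   A minimal sketch of the client side of this manual workflow, in
   Python.  The helper names (range_header, bootstrap_commands) and any
   paths or URLs passed to them are made up for illustration; only the
   git subcommands themselves (clone, remote set-url, fetch) and the
   HTTP Range header are real.  It assumes the bundle is fetched over
   HTTP into a local file, which may already be partially downloaded.

```python
import os


def range_header(partial_path):
    """Return HTTP headers asking only for the bytes not yet on disk.

    An empty dict means "start from the beginning" (no partial file).
    """
    have = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    return {"Range": f"bytes={have}-"} if have else {}


def bootstrap_commands(bundle_path, repo_url, dest):
    """Git commands to run once the bundle file is complete.

    Clone from the local bundle, repoint 'origin' at the real
    repository, then fetch whatever is newer than the bundle.
    """
    return [
        ["git", "clone", bundle_path, dest],
        ["git", "-C", dest, "remote", "set-url", "origin", repo_url],
        ["git", "-C", dest, "fetch", "origin"],
    ]
```

   Each command list can be handed to subprocess.run(cmd, check=True).
   The final fetch only has to transfer what changed since the bundle
   was made, which is the point of the whole exercise.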

2. Seamless solution: 'bundle' or 'bundles' capability.  Requires 
   changes to both server and client.

   If the server supports (advertises) the 'bundle' capability, it can
   serve a list of bundles (as HTTP / FTP / rsync URLs) either at the
   client's request, or after (or before) the list of refs if the client
   requests the 'bundle' capability.

   If the client has support for the 'bundles' capability, it terminates
   the connection to sshd or git-daemon, and does an ordinary resumable
   HTTP fetch using libcurl.  After the bundle is downloaded fully, it
   clones from the bundle, and does git-fetch with the same server as
   before, which would then have less to transfer.  The client also has
   to handle the situation where the bundle download is interrupted, and
   must not clean up in that case, allowing for "git clone --continue".
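
   The "do not clean up" part can be sketched as follows, in Python.
   This is not how git does or would do it; resume_download and the
   fetch_from callback are hypothetical names, and the callback stands
   in for whatever transport is used (for HTTP, a range request
   starting at the given offset).

```python
import os


def resume_download(fetch_from, dest):
    """Download to dest, continuing from a leftover partial file if any.

    fetch_from(offset) is a hypothetical transport callback that yields
    byte chunks starting at `offset`.  If it raises, the partial file is
    left in place so that a later call can pick up where this one stopped.
    """
    part = dest + ".part"
    offset = os.path.getsize(part) if os.path.exists(part) else 0
    with open(part, "ab") as out:
        for chunk in fetch_from(offset):
            out.write(chunk)
    # Reached only if the transfer finished without raising.
    os.replace(part, dest)
    return dest
```

   Calling resume_download again after a failure plays the role of
   "git clone --continue": it asks the transport only for the bytes
   that are still missing.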

3. Seamless solution: GitTorrent or its simplification: git mirror-sync.

   I think that GitTorrent (see http://git.or.cz/gitwiki/SoC2009Ideas)
   or even its simplification, git-mirror-sync, would include restartable
   cloning.  It is even among its intended features.  This would also
   help to download faster via mirrors, which can have faster and better
   network connections.

   But this would be the most work.

You can implement solution 1 even now...
-- 
Jakub Narebski
Poland