On Wed, Aug 19, 2009, Sitaram Chamarty wrote:
> On Wed, Aug 19, 2009 at 12:15 AM, Jakub Narebski <jnareb@xxxxxxxxx> wrote:
> > There is another way we could implement resumable clone.
> > Let git first try to clone the whole repository (single pack; BTW,
> > what happens if this pack is larger than the file size limit of the
> > given filesystem?). If that fails, the client asks for the first
> > half of the repository (half as in bisect, but it is the server
> > that has to calculate it). If that downloads, it asks the server
> > for the rest of the repository. If it fails, it reduces the size by
> > half again, and asks first for 1/4 of the repository in a packfile.
>
> How about an extension where the user can *ask* for a clone of a
> particular HEAD to be sent to him as a git bundle? Or particular
> revisions (say once a week) were kept as a single-file git bundle,
> made available over HTTP -- easily restartable with byte ranges -- and
> anyone who has bandwidth problems first gets that, then changes the
> origin remote URL and does a "pull" to get up to date?
>
> I've done this manually a few times when sneakernet bandwidth was
> better than the normal kind, heh, but it seems to me the lowest-impact
> solution.
>
> Yes, you'd need some extra space on the server, but you keep only one
> bundle, and maybe replace it every week by cron. It should work fine
> right now, as is, with a wee bit of manual work by the user and a
> quick cron entry on the server.

This is a good idea, I think, and it can be implemented with varying
amounts of effort, changes to git, and seamlessness of integration.

1. Simplest solution: social (homepage). Not integrated at all.

On the project's homepage, the one that describes where the project
repository is and how to get it, you add a link to the most recent
bundle (perhaps in addition to the most recent snapshot). This bundle
would be served as a static file via HTTP (and perhaps also FTP) by
any web server that supports resuming (range requests).
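A minimal, runnable sketch of this scheme (all paths and names here are
illustrative; the curl command in the comments is how a client would
actually fetch the bundle resumably over HTTP):

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)

# "Server" side, run e.g. weekly from cron: bundle the repository
# (the path is illustrative) into a single file that any web server
# can serve statically; including HEAD lets a later clone from the
# bundle know which branch to check out.
git init -q "$tmp/project"
git -C "$tmp/project" -c user.name=me -c user.email=me@example.com \
    commit -q --allow-empty -m "initial commit"
git -C "$tmp/project" bundle create "$tmp/project.bundle" HEAD --all

# Client side: the bundle would normally be downloaded resumably with
#   curl -C - -O http://git.example.com/project.bundle
# ("-C -" continues a partial download); here the file is already
# local.  Clone from the bundle, then point origin at the live
# repository and pull to pick up anything newer than the bundle.
git clone -q "$tmp/project.bundle" "$tmp/clone"
git -C "$tmp/clone" config remote.origin.url "$tmp/project"
git -C "$tmp/clone" pull -q
```

In real use the first part would be a cron entry writing the bundle
into the web root, and the client could also run "git bundle verify"
on the downloaded file before cloning from it.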
Or you can make the server generate bundles on demand, only when they
are first requested. "Most recent" might mean the latest tagged
release, or it might mean a daily snapshot^W bundle.

This solution could be integrated into gitweb, either by a generic
'latest bundle' link in the project's README.html (or in the site's
GITWEB_HOMETEXT, default indextext.html), or by having gitweb generate
those links (and perhaps the bundles as well) by itself.

2. Seamless solution: 'bundle' or 'bundles' capability. Requires
changes to both server and client.

If the server supports (advertises) the 'bundle' capability, it can
serve a list of bundles (as HTTP / FTP / rsync URLs), either at the
client's request, or after (or before) the list of refs if the client
requests the 'bundle' capability.

If the client has support for the 'bundles' capability, it terminates
the connection to sshd or git-daemon and does an ordinary resumable
HTTP fetch using libcurl. After the bundle is fully downloaded, it
clones from the bundle, then does a git-fetch against the same server
as before, which would then have less to transfer. The client also has
to handle the situation where the bundle download is interrupted, and
not clean up, allowing for "git clone --continue".

3. Seamless solution: GitTorrent, or its simplification, git
mirror-sync.

I think that GitTorrent (see http://git.or.cz/gitwiki/SoC2009Ideas),
or even its simplification git-mirror-sync, would include restartable
cloning; it is even among its intended features. This would also help
clients download faster via mirrors, which can have faster and better
network connections. But this would be the most work.

You can implement solution 1 even now...

--
Jakub Narebski
Poland
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html