Re: Resumable clone

Hi Junio & Duy,

On Sat, 5 Mar 2016, Junio C Hamano wrote:

> Duy Nguyen <pclouds@xxxxxxxxx> writes:
> 
> > Resumable clone is happening. See [1] for the basic idea, [2] and [3]
> > for some preparation work. I'm sure you can help. Once you've gone
> > through at least [1], I think you can pick something (e.g. finalizing
> > the protocol, update the server side, or git-clone....)
> >
> > [1] http://thread.gmane.org/gmane.comp.version-control.git/285921
> > [2] http://thread.gmane.org/gmane.comp.version-control.git/288080/focus=288150
> > [3] http://thread.gmane.org/gmane.comp.version-control.git/288205/focus=288222
> 
> I think your response needs to be refined with a bit higher level
> overview, though.  Here are some thoughts to summarize the discussion
> and to extend it.
> 
> I think the right way to think about this is that we are adding a
> capability for the server to instruct the clients: I prefer not to
> serve a full clone to you in the usual route if I can avoid it.  You
> can help me by going to an alternate resource and populate your
> history first and then coming back to me for an additional fetch to
> complete the history if you want to.  Doing so would also help you
> because that alternate resource can be a static file (or two) that
> you can download over a resumable transport (like static files
> served over HTTPS).

For quite some time I considered presenting some alternate/additional
ideas. I feel a little bad for mentioning them here because I *really*
have no time to follow up on them whatsoever. But maybe they turn out to
contribute something to the final solution.

I tried to follow the discussion as much as possible, sometimes failing
due to time constraints, therefore I'd like to apologize in advance if any
of these ideas have been mentioned already.

First of all: my main gripe with the discussed approach is that it uses
bundles. I know, I introduced bundles, but they just seem too clunky and
too static for the resumable clone feature.

So I wonder whether it would be possible to come up with a subset of the
revs in a stable order, with associated thin packs (using prior revs as
negative revs in the commit range), such that each thin pack weighs roughly
1 MB (or whatever granularity you desire). My thinking was that it should
be possible to follow a strategy similar to bisect's to come up with said
list.
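
To make the chunking idea concrete, here is a minimal sketch. It is not the
bisect-like strategy mentioned above but a simpler greedy pass, and it
assumes that an estimated incremental pack size per rev is already
available. The names pick_checkpoints and pack_ranges, and the size
estimates, are hypothetical.

```python
from typing import List, Optional, Tuple


def pick_checkpoints(revs: List[Tuple[str, int]],
                     target_bytes: int = 1_000_000) -> List[str]:
    """Pick checkpoint revs from a stable-ordered list.

    `revs` holds (rev, estimated incremental pack bytes) pairs in the
    stable order. A checkpoint is emitted whenever the bytes accumulated
    since the previous checkpoint reach `target_bytes`.
    """
    checkpoints: List[str] = []
    pending = 0
    for rev, size in revs:
        pending += size
        if pending >= target_bytes:
            checkpoints.append(rev)
            pending = 0
    # Always end on the last rev so the chunks cover the whole list.
    if revs and (not checkpoints or checkpoints[-1] != revs[-1][0]):
        checkpoints.append(revs[-1][0])
    return checkpoints


def pack_ranges(checkpoints: List[str]) -> List[Tuple[Optional[str], str]]:
    """Turn checkpoints into (negative rev, positive rev) pairs.

    The first pair has no negative rev; every later thin pack excludes
    what the previous checkpoint already delivered.
    """
    ranges: List[Tuple[Optional[str], str]] = []
    previous: Optional[str] = None
    for rev in checkpoints:
        ranges.append((previous, rev))
        previous = rev
    return ranges
```

Each (negative, positive) pair would then describe one thin pack of
roughly the target size.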

The client could then state that it was interrupted while downloading a
given rev's pack, at a specific offset, and the (thin) pack could be
regenerated on the fly (or cached), serving only the desired chunk. The
server would then also automatically know where in the list of
stable-ordered revs the clone was interrupted and continue with the next
one.
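
The resume step could look roughly like this on the server side. This is
only a sketch under simple assumptions: the packs are held in memory as
bytes, keyed by checkpoint rev, and resume_stream is a hypothetical name,
not an existing Git function.

```python
from typing import Dict, Iterator, List


def resume_stream(order: List[str], packs: Dict[str, bytes],
                  rev: str, offset: int) -> Iterator[bytes]:
    """Yield what the client still needs after an interruption.

    `order` is the stable-ordered checkpoint list, `packs` maps each
    checkpoint to its (cached or regenerated) thin pack, and the client
    reports the `rev` whose pack it was downloading plus the byte
    `offset` it had reached.
    """
    index = order.index(rev)  # raises ValueError for an unknown rev
    pack = packs[rev]
    if not 0 <= offset <= len(pack):
        raise ValueError("offset lies outside the pack")
    # Serve only the missing tail of the interrupted pack ...
    if offset < len(pack):
        yield pack[offset:]
    # ... then continue with every later pack in the stable order.
    for later in order[index + 1:]:
        yield packs[later]
```

For example, a client that stopped two bytes into the pack for "d" would
receive the rest of that pack followed by all later packs.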

Oh, and if regenerating the thin pack instead of caching it, we need to
ensure a stable packing (i.e. no threads!). That is, given a commit range,
we need to (re-)generate bytewise-identical thin packs.
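
Whether regeneration really is stable can at least be checked empirically.
The sketch below hashes several regenerations of the same range and
compares the digests. The regenerate callable is a placeholder for
whatever produces the thin pack for a given commit range; it is an
assumption made for illustration, and passing this check does not prove
stability in general.

```python
import hashlib
from typing import Callable


def is_stable(regenerate: Callable[[], bytes], runs: int = 3) -> bool:
    """Return True if `runs` regenerations yield identical bytes.

    `regenerate` is a hypothetical callable that rebuilds the thin pack
    for one fixed commit range and returns its raw bytes.
    """
    digests = {hashlib.sha1(regenerate()).hexdigest() for _ in range(runs)}
    return len(digests) == 1
```

Resuming at a byte offset is only safe if a check like this holds for
every range the server offers.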

Of course this stable-ordered rev list would have to be persisted when the
server serves its first resumable clone and then extended with future
resumable clones whenever new revisions were pushed. (And there would also
have to be some way to evict no-longer-reachable revs, maybe by simply
regenerating the whole shebang.)

For all of this to work, the most crucial idea would be this one: a clone
can *always* start as-is. Only when the clone was interrupted, the server
supports the "resumable clone" capability, and the client is "resuming"
the clone would the client *actually* ask for a resumable clone.

Yes, this could potentially waste a bit of bandwidth on the part of the
user with a flaky connection (because whatever was transferred during the
first, non-resumable clone would be thrown out of the window), but it might
make it easier for us to provide a non-fragile upgrade path because the
cloning process would still default to the current one.
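
The client-side decision described above boils down to a small rule,
sketched here. The capability string "resumable-clone" and the returned
labels are made up for illustration; no such capability name has been
settled on.

```python
from typing import Set


def choose_request(interrupted: bool, server_caps: Set[str]) -> str:
    """Decide which kind of clone the client asks for.

    The first attempt is always a plain clone. Only a retry after an
    interruption, against a server advertising the (hypothetical)
    "resumable-clone" capability, switches to the resumable path.
    """
    if interrupted and "resumable-clone" in server_caps:
        return "resumable"
    return "plain"
```

Servers without the capability, and clients that never get interrupted,
would behave exactly as they do today.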

Food for thought?

Ciao,
Dscho