Re: [WIP RFC 2/5] Documentation: add Packfile URIs design doc

> > +This feature allows servers to serve part of their packfile response as URIs.
> > +This allows server designs that improve scalability in bandwidth and CPU usage
> > +(for example, by serving some data through a CDN), and (in the future) provides
> > +some measure of resumability to clients.
> 
> Without reading the remainder, this makes readers anticipate a few
> good things ;-)
> 
>  - "part of", so pre-generated constant material can be given from
>    CDN and then followed-up by "filling the gaps" small packfile,
>    perhaps?

Yes :-)

>  - The "part of" transmission may not bring the repository up to
>    date wrt to the "want" objects; would this feature involve "you
>    asked history up to these commits, but with this pack-uri, you'll
>    be getting history up to these (somewhat stale) commits"?

It could be, but not necessarily. In my current WIP implementation, for
example, pack URIs don't give you any commits at all (and thus, no
history) - only blobs. Quite a few people first think of the "stale
clone then top-up" case, though - I wonder if it would be a good idea to
give the blob example in this paragraph in order to put people in the
right frame of mind.
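
To illustrate, here is a rough sketch in Python (not the actual
implementation - the size cutoff, the object representation, and the
pack()/cdn_upload() helpers are all made up) of what such a server
could do:

    # Pre-generate a pack of just the large blobs, put it on a CDN, and
    # exclude those blobs from the small pack generated per fetch.
    # BLOB_SIZE_LIMIT, pack(), cdn_upload(), and the .type/.size fields
    # on objects are illustrative names only.

    BLOB_SIZE_LIMIT = 10 * 1024 * 1024  # e.g. blobs over 10 MiB go to the CDN

    def split_objects(objects):
        cdn_blobs, remainder = [], []
        for obj in objects:
            if obj.type == "blob" and obj.size >= BLOB_SIZE_LIMIT:
                cdn_blobs.append(obj)
            else:
                remainder.append(obj)
        return cdn_blobs, remainder

    def respond_to_fetch(objects):
        cdn_blobs, remainder = split_objects(objects)
        uris = [cdn_upload(pack(cdn_blobs))]  # pregenerated, CDN-cacheable
        fill_in_pack = pack(remainder)        # commits, trees, small blobs
        return uris, fill_in_pack             # client needs both to be complete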

> > +If the client replies with the following arguments:
> > +
> > + * packfile-uris
> > + * thin-pack
> > + * ofs-delta
> 
> "with the following" meaning "with all of the following", or "with
> any of the following"?  Is there a reason why the server side must
> require that the client understands and is willing to accept a
> thin-pack when wanting to use packfile-uris?  The same question for
> the ofs-delta.

"All of the following", but from your later comments, we probably don't
need this section anyway.

> My recommendation is to drop the mention of "thin" and "ofs" from
> the above list, and also from the following paragraph.  The "it MAY
> send" will serve as a practical escape clause to allow a server/CDN
> implementation that *ALWAYS* prepares pregenerated material that can
> only be digested by clients that supports thin and ofs.  Such a server
> can send packfile-URIs only when all of the three are given by the
> client and be compliant.  And such an update to the proposed document
> would allow a more diskful server to prepare both thin and non-thin
> pregenerated packs and choose which one to give to the client depending
> on the capability.

That is true - we can just let the server decide. I'll update the patch
accordingly, and state that the URIs should point to packs with features
like thin-pack and ofs-delta only if the client has declared that it
supports them.
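
For the "more diskful server" case you describe, the selection could be
as simple as the following sketch (again illustrative only - the
pregenerated-pack table and the URIs are invented):

    # Advertise only packfile URIs the client can digest, based on the
    # capabilities it declared in its request.

    PREGENERATED = {
        # (thin-pack supported, ofs-delta supported) -> pregenerated pack URI
        (True, True):   "https://cdn.example.com/repo/base-thin-ofs.pack",
        (False, False): "https://cdn.example.com/repo/base-plain.pack",
    }

    def packfile_uris_to_advertise(client_caps):
        if "packfile-uris" not in client_caps:
            return []  # client did not opt in; send everything inline
        key = ("thin-pack" in client_caps, "ofs-delta" in client_caps)
        uri = PREGENERATED.get(key)
        return [uri] if uri else []  # no suitable pack: fall back to inline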

> > +Clients then should understand that the returned packfile could be incomplete,
> > +and that it needs to download all the given URIs before the fetch or clone is
> > +complete. Each URI should point to a Git packfile (which may be a thin pack and
> > +which may contain offset deltas).
> 
> weaken or remove the (parenthetical comment) in the last sentence,
> and replace the beginning of the section with something like
> 
> 	If the client replies with 'packfile-uris', when the server
> 	sends the packfile, it MAY send a `packfile-uris` section...
> 
> You may steal what I wrote in the above response to help the
> server-side folks to decide how to actually implement the "it MAY
> send a packfile-uris" part in the document.

OK, will do.

> OK, this comes back to what I alluded to at the beginning.  We could
> respond to a full-clone request by feeding a series of packfile-uris
> and some ref information, perhaps like this:
> 
> 	* Grab this packfile and update your remote-tracking refs
>           and tags to these values; you'd be as if you cloned the
>           project when it was at v1.0.
> 
> 	* When you are done with the above, grab this packfile and
>           update your remote-tracking refs and tags to these values;
>           you'd be as if you cloned the project when it was at v2.0.
> 
> 	* When you are done with the above, grab this packfile and
>           update your remote-tracking refs and tags to these values;
>           you'd be as if you cloned the project when it was at v3.0.
> 
> 	...
> 
> 	* When you are done with the above, here is the remaining
>           packdata to bring you fully up to date with your original
>           "want"s.
> 
> and before fully reading the proposal, I anticipated that it was
> what you were going to describe.  The major difference is "up to the
> packdata given to you so far, you'd be as if you fetched these" ref
> information, which would allow you to be interrupted and then simply
> resume, without having to remember the set of packfile-uris yet to
> be processed across a fetch/clone failure.  If you successfully fetch
> packfile for ..v1.0, you can update the remote-tracking refs to
> match as if you fetched back when that was the most recent state of
> the project, and then if you failed while transferring packfile for
> v1.0..v2.0, the resuming would just reissue "git fetch" internally.

The "up to" would work if we had the stale clone + periodic "upgrades"
arrangement you describe, but not when, for example, we just want to
separate large blobs out. If we were to insist on attaching ref
information to each packfile URI (or turning the packfiles into bundles),
it is true that we could have resumable fetch, although I haven't fully
thought out the implications of letting the user modify the repository
while a fetch is in progress. (What happens if the user wipes out their
object store in between fetching one packfile and the next, for
example?) That is why I only talked about resumable clone, not resumable
fetch.
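
If we did go the staged route you outline, the client side could look
roughly like this (entirely hypothetical - the stage format, the state
file, and the helpers are invented for illustration, and none of this is
in the current RFC):

    # Each stage pairs a pack URI with the ref values that are consistent
    # once that pack has been applied; progress is recorded so an
    # interrupted clone can resume at the next stage.  STATE_FILE,
    # fetch_and_index_pack(), and update_refs() are all invented names.

    import os

    STATE_FILE = ".git/clone-stages-done"

    def resume_clone(stages):
        # stages: list of {"uri": ..., "refs": {refname: oid}} entries
        done = int(open(STATE_FILE).read()) if os.path.exists(STATE_FILE) else 0
        for i, stage in enumerate(stages[done:], start=done):
            fetch_and_index_pack(stage["uri"])  # download and index this pack
            update_refs(stage["refs"])          # repo now matches an older fetch
            with open(STATE_FILE, "w") as f:
                f.write(str(i + 1))             # remember progress for a resume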


