Re: [WIP RFC 2/5] Documentation: add Packfile URIs design doc

Christian Couder <christian.couder@xxxxxxxxx> · Fri, 22 Feb 2019 12:35:00 +0100

On Tue, Feb 19, 2019 at 9:10 PM Jonathan Tan <jonathantanmy@xxxxxxxxxx> wrote:
>
> > > Good points about SSH support and the client needing to control which
> > > protocols the server will send URIs for. I'll include a line in the
> > > client request in which the client can specify which protocols it is OK
> > > with.
> >
> > What if a client is ok to fetch from some servers but not others (for
> > example github.com and gitlab.com but nothing else)?
> >
> > Or what if a client is ok to fetch using SSH from some servers and
> > HTTPS from other servers but nothing else?
>
> The objects received from the various CDNs are still rehashed by the
> client (so they are identified with the correct name), and if the client
> is fetching from a server, presumably it can trust the URLs it receives
> (just like it trusts ref names, and so on). Do you know of a specific
> case in which a client wants to fetch from some servers but not others?

For example I think the Great Firewall of China lets people in China
use GitHub.com but not Google.com. So if people start configuring
their repos on GitHub so that they send packs that contain Google.com
CDN URLs (or actually anything that the Firewall blocks), it might
create many problems for users in China if they don't have a way to
opt out of receiving packs with those kind of URLs.

> (In any case, if this happens, the client can just disable the CDN
> support.)

Would this mean that people in China will not be able to use the
feature at all, because too many of their clones could be blocked? Or
that they will have to create forks to mirror any interesting repo and
 reconfigure those forks to work well from China?

> > I also wonder in general how this would interact with promisor/partial
> > clone remotes.
> >
> > When we discussed promisor/partial clone remotes in the thread
> > following this email:
> >
> > https://public-inbox.org/git/20181016174304.GA221682@xxxxxxxxxxxxxxxxxxxxxxxxx/
> >
> > it looked like you were ok with having many promisor remotes, which I
> > think could fill the same use cases especially related to large
> > objects.
> >
> > As clients would configure promisor remotes explicitly, there would be
> > no issues about which protocol and servers are allowed or not.
> >
> > If the issue is that you want the server to decide which promisor
> > remotes would be used without the client having to do anything, maybe
> > that could be something added on top of the possibility to have many
> > promisor remotes.
>
> It's true that there is a slight overlap with respect to large objects,
> but this protocol can also handle large sets of objects being offloaded
> to CDN, not only single ones.

Isn't partial clone also designed to handle large sets of objects?

> (The included implementation only handles
> single objects, as a minimum viable product, but it is conceivable that
> the server implementation is later expanded to allow offloading of sets
> of objects.)
>
> And this protocol is meant to be able to use CDNs to help serve objects,
> whether single objects or sets of objects. In the case of promisor
> remotes, the thing we fetch from has to be a Git server.

When we discussed the plan for many promisor remotes, Jonathan Nieder
(in the email linked above) suggested:

 2. Simplifying the protocol for fetching missing objects so that it
    can be satisfied by a lighter weight object storage system than
    a full Git server.  The ODB helpers introduced in this series are
    meant to speak such a simpler protocol since they are only used
    for one-off requests of a collection of missing objects instead of
    needing to understand refs, Git's negotiation, etc.

and I agreed with that point.

Is there something that you don't like in many promisor remotes?

> (We could use
> dumb HTTP from a CDN, but that defeats the purpose in at least one way -
> with dumb HTTP, we have to fetch objects individually, but with URL
> support, we can fetch objects as sets too.)