On Tue, Sep 10, 2024 at 06:29:59PM +0200, Christian Couder wrote: > diff --git a/Documentation/config/promisor.txt b/Documentation/config/promisor.txt > index 98c5cb2ec2..9cbfe3e59e 100644 > --- a/Documentation/config/promisor.txt > +++ b/Documentation/config/promisor.txt > @@ -1,3 +1,20 @@ > promisor.quiet:: > If set to "true" assume `--quiet` when fetching additional > objects for a partial clone. > + > +promisor.advertise:: > + If set to "true", a server will use the "promisor-remote" > + capability, see linkgit:gitprotocol-v2[5], to advertise the > + promisor remotes it is using, if it uses some. Default is > + "false", which means the "promisor-remote" capability is not > + advertised. > + > +promisor.acceptFromServer:: > + If set to "all", a client will accept all the promisor remotes > + a server might advertise using the "promisor-remote" > + capability. Default is "none", which means no promisor remote > + advertised by a server will be accepted. By accepting a > + promisor remote, the client agrees that the server might omit > + objects that are lazily fetchable from this promisor remote > + from its responses to "fetch" and "clone" requests from the > + client. See linkgit:gitprotocol-v2[5]. I wonder a bit about whether making this an option is all that sensible, because that would of course apply globally to every server that you might want to clone from. Wouldn't it be more sensible to make this configurabe per server? Another question: servers may advertise bogus addresses to us, and as far as I can see there are currently no precautions in place against malicious cases. The server might for example use this to redirect us to a remote that uses no encryption, the Git protocol or even the "file://" protocol. I guess the sane thing here would be to default to allow clones via "https://" only, but make the set of accepted protocols configurable. > diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt > index 414bc625d5..65d5256baf 100644 > --- a/Documentation/gitprotocol-v2.txt > +++ b/Documentation/gitprotocol-v2.txt > @@ -781,6 +781,60 @@ retrieving the header from a bundle at the indicated URI, and thus > save themselves and the server(s) the request(s) needed to inspect the > headers of that bundle or bundles. > > +promisor-remote=<pr-infos> > +~~~~~~~~~~~~~~~~~~~~~~~~~~ > + > +The server may advertise some promisor remotes it is using or knows > +about to a client which may want to use them as its promisor remotes, > +instead of this repository. In this case <pr-infos> should be of the > +form: > + > + pr-infos = pr-info | pr-infos ";" pr-info Wouldn't it be preferable to make this multiple lines so that we cannot ever burst through the pktline limits? > + pr-info = "name=" pr-name | "name=" pr-name "," "url=" pr-url > + > +where `pr-name` is the urlencoded name of a promisor remote, and > +`pr-url` the urlencoded URL of that promisor remote. > +In this case, if the client decides to use one or more promisor > +remotes the server advertised, it can reply with > +"promisor-remote=<pr-names>" where <pr-names> should be of the form: One of the things that LFS provides is custom transfer types. It is for example possible to use NFS or some other arbitrary protocol to fetch or upload data. It should be possible to provide similar functionality on the Git side via custom transport helpers, too, and if we make the accepted set of helpers configurable as proposed further up this could be made safe, too. But one thing I'm missing here is any documentation around how the client would know which promisor-remote to pick when the remote advertises multiple of them. The easiest schema would of course be to pick the first one whose transport helper the client understands and considers to be safe. But given that we're talking about offloading of large blobs, would we have usecases for advertising e.g. region-scoped remotes that require more information on the client-side? Also, are the promisor remotes promising to each contain all objects? Or would the client have to ask each promisor remote until it finds a desired object? > + pr-names = pr-name | pr-names ";" pr-name > + > +where `pr-name` is the urlencoded name of a promisor remote the server > +advertised and the client accepts. > + > +Note that, everywhere in this document, `pr-name` MUST be a valid > +remote name, and the ';' and ',' characters MUST be encoded if they > +appear in `pr-name` or `pr-url`. So I assume the intent here is to let the client add that promisor remote with that exact, server-provided name? That makes me wonder about two different scenarios: - We must keep the remote from announcing "origin". - What if we eventually decide to allow users to provide their own names for remotes during git-clone(1)? Overall, I don't think that it's a good idea to let the remote dictate which name a client's remotes have. > +If the server doesn't know any promisor remote that could be good for > +a client to use, or prefers a client not to use any promisor remote it > +uses or knows about, it shouldn't advertise the "promisor-remote" > +capability at all. > + > +In this case, or if the client doesn't want to use any promisor remote > +the server advertised, the client shouldn't advertise the > +"promisor-remote" capability at all in its reply. > + > +The "promisor.advertise" and "promisor.acceptFromServer" configuration > +options can be used on the server and client side respectively to > +control what they advertise or accept respectively. See the > +documentation of these configuration options for more information. > + > +Note that in the future it would be nice if the "promisor-remote" > +protocol capability could be used by the server, when responding to > +`git fetch` or `git clone`, to advertise better-connected remotes that > +the client can use as promisor remotes, instead of this repository, so > +that the client can lazily fetch objects from these other > +better-connected remotes. This would require the server to omit in its > +response the objects available on the better-connected remotes that > +the client has accepted. This hasn't been implemented yet though. So > +for now this "promisor-remote" capability is useful only when the > +server advertises some promisor remotes it already uses to borrow > +objects from. In the cover letter you mention that the server may not even have some objects at all in the future. I wonder how that is supposed to interact with clients that do not know about the "promisor-remote" capability at all though.