Re: [PATCH 3/4] Add 'promisor-remote' capability to protocol v2

Christian Couder <christian.couder@xxxxxxxxx> · Tue, 20 Aug 2024 13:32:30 +0200

On Wed, Jul 31, 2024 at 6:16 PM Taylor Blau <me@xxxxxxxxxxxx> wrote:
>
> On Wed, Jul 31, 2024 at 03:40:13PM +0200, Christian Couder wrote:
> > diff --git a/Documentation/gitprotocol-v2.txt b/Documentation/gitprotocol-v2.txt
> > index 414bc625d5..4d8d3839c4 100644
> > --- a/Documentation/gitprotocol-v2.txt
> > +++ b/Documentation/gitprotocol-v2.txt
> > @@ -781,6 +781,43 @@ retrieving the header from a bundle at the indicated URI, and thus
> >  save themselves and the server(s) the request(s) needed to inspect the
> >  headers of that bundle or bundles.
> >
> > +promisor-remote=<pr-infos>
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +The server may advertise some promisor remotes it is using, if it's OK
> > +for the server that a client uses them too. In this case <pr-infos>
> > +should be of the form:
> > +
> > +     pr-infos = pr-info | pr-infos ";" pr-info
>
> So <pr-infos> uses the ';' character to delimit multiple <pr-info>s,
> which means that <pr-info> can't use ';' itself. You mention above that
> <pr-info> is supposed to be generic so that we can add other fields to
> it in the future. Do you imagine that any of those fields might want to
> use the ';' in their values?

Yeah, but, as for the 'url' field where the value is urlencoded, the
value can be encoded if it could contain some special characters.

> One that comes to mind is the shared token example you wrote about
> above. It would be nice to not restrict what characters the token can
> contain.

I agree but I think urlencoding, or maybe other simple encodings like
base64, should be easy and simple enough to work around this.

> I wonder if it would instead be useful to have <pr-infos> first write
> out how many <pr-info>s it contains, and then write out each <pr-info>
> separated by a NUL byte, so that none of the files in the <pr-info>
> itself are restricted in what characters they can use.

I am not sure how NUL bytes would interfere with the pkt-line.[c,h] code though.

> > +static void promisor_info_vecs(struct repository *repo,
> > +                            struct strvec *names,
> > +                            struct strvec *urls)
> > +{
> > +     struct promisor_remote *r;
> > +
> > +     promisor_remote_init(repo);
> > +
> > +     for (r = repo->promisor_remote_config->promisors; r; r = r->next) {
> > +             char *url;
> > +             char *url_key = xstrfmt("remote.%s.url", r->name);
> > +
> > +             strvec_push(names, r->name);
> > +             strvec_push(urls, git_config_get_string(url_key, &url) ? NULL : url);
>
> Do you mean to push NULL onto urls here? It seems risky since you have
> to check that each entry in the strvec is non-NULL before printing it
> out (which you do below in promisor_remote_info()).

The code doesn't seem risky to me as it allows us to treat the case
when git_config_get_string() fails and when it succeeds but possibly
sets 'url' to NULL (not sure if it's possible though as I didn't
check) in the same way.

Yeah, it means that we have to check if each entry in the strvec is
non-NULL, but I think it's quite easy, and honestly I didn't want to
ask myself questions like should we treat an URL of a remote
configured as an empty string in the same way as the URL not
configured. I think it's much simpler to just pass as-is the content,
if any, that we get from git_config_get_string().

> Or maybe you need to in order to advertise promisor remotes without
> URLs?

Yeah, I think we should advertise promisor remotes that don't have an
URL configured. It might seem strange, but maybe servers might want in
the future to have hidden/secret URLs (URLs that they use, likely
internally on the server, but don't want to pass for some reason to a
client).

> If so, I'm not sure what the benefit would be to the client if it
> doesn't know where to go to retrieve any objects without having a URL.

The client might already have an URL for the promisor-remote (and it
might be a different one than the one the server would use if
hidden/secret URLs become a thing). That's why patch 4/4 implements
the "KnownName" value for the "promisor.acceptFromServer" config
option.