Re: [PATCH v3] config: add support for http.<url>.* settings

Jeff King <peff@xxxxxxxx> · Mon, 15 Jul 2013 00:43:38 -0400

On Fri, Jul 12, 2013 at 06:07:35AM -0700, Kyle J. McKay wrote:

> >It looks like you're matching the URLs as raw strings, and I don't see
> >any canonicalization going on. What happens if I have
> >"https://example.com/foo+bar"; in my config, but then I visit
> >"https://example.comfoo%20bar";?
> 
> The documentation for http.<url>.* says:

Yes, I know. My question was more rhetorical, as in "what _should_
happen". It seems unfriendly not to do at least some basic
canonicalization with respect to encodings.

> >That does make your by-length ordering impossible, but I don't
> >think you
> >can do it in the general case. If I have:
> >
> > [http "http://peff@xxxxxxxxxxx";] foo = 1
> > [http "http://example.com:8080";] foo = 2
> >
> >and I visit "http://peff@xxxxxxxxxxx:8080";, which one is the winner?
> 
> If I were to spilt everything out, then I would only have the second
> one match.  The first config is on a different port, quite possibly
> an entirely different service.  It shouldn't match.

Yeah, sorry, that was a bad example, because a missing port is not
"do not care", but rather "default port". A better example is:

  [http "http://peff@xxxxxxxxxxx/";] foo = 1
  [http "http://example.com/path/";] foo = 2

If we see the URL "http://peff@xxxxxxxxxxx/path/foo";, I would expect the
second one to match (it does not under a pure-prefix scheme). If we were
to make it match, you cannot say which of the two entries is "more
specific" they are both specific in different ways.

> I don't think it's necessary to split the URL apart though.  Whatever
> URL the user gave to git on the command line (at some point even if
> it's now stored as a remote setting in config) complete with URL-
> encoding, user names, port names, etc. is the same url, possibly
> shortened, that needs to be used for the http.<url>.option setting.

I'm not sure I agree with this. The URL handed to git is not always
typed directly by the user. E.g., it might be cut-and-paste from an
email or website, or it may even be fed by a script (e.g., the "repo"
tool will pull URLs from its manifest file).

> I think that's simple and very easy to explain and avoids user
> confusion and surprise while still allowing a default to be set for a
> site but easily overridden for a portion of that site without needing
> to worry about the order config files are processed or the order of
> the [http "<url>"] sections within them.

But the user has to worry about last-one-wins anyway for same-length
prefixes, as you noted below.

> >using the usual "last one wins" strategy that our config uses. It has
> >the unfortunate case that:
> >
> > [http "http://example.com";]
> >    foo = 1
> > [http]
> >    foo = 2
> >
> >will always choose http.foo=2, but I don't think that is a big problem
> >in practice.
> 
> I think that violates the principle of least surprise.  In this case:
> 
> [http "https://cacert.org";]
>   sslVerify = false
> [http]
>   sslVerify = true

Sure, I think yours is less surprising in this case. But it is more
surprising in other cases, like ones where URL encoding or optional
components are involved.  E.g., imagine your two options above were
flipped (you do not usually verify ssl, but it is very important for you
to do so against cacert.org). An automated tool like repo tries to clone
from https://user@xxxxxxxxxx, and you might be surprised that SSL
verification is not turned on.

> >>-	git_config(http_options, NULL);
> >>+	git_config(http_options, (void *)url);
> >
> >No need to cast again. :)
> 
> Ah, but there is in order to avoid a warning:

Ah, you're right. Sorry for the noise.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html