Re: [PATCH 2/3] credential: teach `credential_from_url()` a non-strict mode

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Jonathan,

On Wed, 22 Apr 2020, Jonathan Nieder wrote:

> Johannes Schindelin wrote:
>
> > There was one call site, though, that really needed that leniency: when
> > parsing config settings a la `credential.dev.azure.com.useHTTPPath`.
>
> Thanks for tackling this.
>
> Can the commit message say a little more about the semantics and when
> someone would use this?
>
> Is it a shortcut for
>
> 	[credential "http://dev.azure.com";]
> 		useHttpPath = true
>
> 	[credential "https://dev.azure.com";]
> 		useHttpPath = true
>
> ?

I don't really want to be too verbose about this in _this_ commit message.
But I do see your point. This is my current version of the commit message:

    credential: optionally allow partial URLs in credential_from_url_gently()

    Prior to the fixes for CVE-2020-11008, we were _very_ lenient in what we
    required from a URL in order to parse it into a `struct credential`.
    That led to serious vulnerabilities.

    There was one call site, though, that really needed that leniency: when
    parsing config settings a la `credential.dev.azure.com.useHTTPPath`.
    Settings like this might be desired when users want to use, say, a given
    user name on a given host, regardless of the protocol to be used.

    In preparation for fixing that bug, let's add a parameter called
    `allow_partial_url` to the `credential_from_url_gently()` function and
    convert the existing callers to set that parameter to 0, i.e. do not
    change the existing behavior, just add the option to be more lenient.

    Please note that this patch does more than just reinstating a way to
    imitate the behavior before those CVE-2020-11008 fixes: Before that, we
    would simply ignore URLs without a protocol. In other words,
    misleadingly, the following setting would be applied to _all_ URLs:

            [credential "example.com"]
                    username = that-me

    The obvious intention is to match the host name only. With this patch,
    we allow precisely that: when parsing the URL with non-zero
    `allow_partial_url`, we do not simply return success if there was no
    protocol, but we simply leave the protocol unset and continue parsing
    the URL.

As you see, I added substantially more information there.

> > In preparation for fixing that regression, let's add a parameter
> > called `strict` to the `credential_from_url()` function and convert
> > the existing callers to enforce that strict mode.
>
> I suspect this would be easier to read squashed with patch 3.  That
> would also mean that the functionality and test coverage come at the
> same time.

On the contrary, I _really_ want them to be separate, so that it is easier
to review. I know that _I_ had an easier time looking over the patch that
introduces the non-strict mode, to make sure that it does not change a
thing in strict mode (and yes, v1 made that harder than necessary by _not_
using `strict ||` in the condition that guards the `c->host` assignment).

In short: it is my intention to keep the two patches separate, with the
main goal to make sure that I don't introduce any of those regressions
Peff is worried about.

> [...]
> > diff --git a/credential.c b/credential.c
> > index 64a841eddca..c73260ac40f 100644
> > --- a/credential.c
> > +++ b/credential.c
> > @@ -344,7 +344,7 @@ static int check_url_component(const char *url, int quiet,
> >  }
> >
> >  int credential_from_url_gently(struct credential *c, const char *url,
> > -			       int quiet)
> > +			       int strict, int quiet)
>
> The collection of flags makes me wonder whether it's time to use a
> single "flags" parameter with flags that are |ed together.  That way,
> call sites are easier to read without requiring cross-reference
> assistance to see which option each boolean parameter represents.

I resisted the temptation because I don't want this to be a big patch
series, but a fast one. The reports keep coming in, and the current plan
seems to be for GitHub Desktop to release another version on Monday, for
which I need to release a MinGit backport probably tomorrow, otherwise it
won't work timewise.

Meaning: I don't really want to do "nice to have" work like this flag.

> Alternatively, could the non-strict form be a separate public function
> that uses the same static helper that takes two boolean args?  That is,
> something like
>
> 	int credential_from_url_gently(struct credential *c, const char *url,
> 				       int quiet)
> 	{
> 		return parse_credential_url(c, url, 1, quiet);
> 	}
>
> 	int credential_from_url_nonstrict(struct credential *c, const char *url,
> 					  int quiet)
> 	{
> 		return parse_credential_url(c, url, 0, quiet);
> 	}

Looks good, but there are only three callers (and I don't expect more).
And the only caller interested in the lenient mode (read: the only
now-complicated call) lives in the very same file as the called function.
So it strikes me as quite a bit of an overkill to introduce two new
functions.

> [...]
> > @@ -357,12 +357,12 @@ int credential_from_url_gently(struct credential *c, const char *url,
> >  	 *   (3) proto://<user>:<pass>@<host>/...
> >  	 */
> >  	proto_end = strstr(url, "://");
> > -	if (!proto_end || proto_end == url) {
> > +	if (strict && (!proto_end || proto_end == url)) {
> >  		if (!quiet)
> >  			warning(_("url has no scheme: %s"), url);
> >  		return -1;
> >  	}
>
> When !strict, this means we are not requiring a protocol.  No other
> difference appears to be intended.

Almost. My intention was to handle missing host names, too: `/repo.git`
should also match `https://example.com/repo.git`.

The user name is optional already, anyway.

BTW I realized (while working on these patches) that it would probably be
a good idea to warn about passwords in these `credential.<URL>.<key>`
settings: the password will be ignored as far as `credential_match()` is
concerned, and they should probably not live in the config.

But for aforementioned reasons, I decided against including a patch that
makes that happen.

> [...]
> > @@ -382,8 +382,10 @@ int credential_from_url_gently(struct credential *c, const char *url,
> >  		host = at + 1;
> >  	}
> >
> > -	c->protocol = xmemdupz(url, proto_end - url);
> > -	c->host = url_decode_mem(host, slash - host);
> > +	if (proto_end && proto_end - url > 0)
> > +		c->protocol = xmemdupz(url, proto_end - url);
>
> What should happen when the protocol isn't present?  Does this mean
> callers will need to be audited to make sure they handle NULL?

Again, the sole caller of this lenient mode is the config parser which
then wants to match the (partial) URL against the actual URL, using this
function:

	int credential_match(const struct credential *want,
			     const struct credential *have)
	{
	#define CHECK(x) (!want->x || (have->x && !strcmp(want->x, have->x)))
		return CHECK(protocol) &&
		       CHECK(host) &&
		       CHECK(path) &&
		       CHECK(username);
	#undef CHECK
	}

The lenient parser is supposed to populate the `want` of this function. In
other words, the code _expects_ `protocol` (or for that matter, _any_
attribute of `want`) to be optional.

> > +	if (slash - url > 0)
> > +		c->host = url_decode_mem(host, slash - host);
>
> What should happen the URL starts with a slash?

Oh... I thought it was obvious that a partial URL starting with a slash
would refer to the path part only... Does that not make sense?

Ciao,
Dscho




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux