On Fri, Feb 14, 2020 at 10:59:29PM +0000, brian m. carlson wrote: > From: brian m. carlson <bk2204@xxxxxxxxxx> > > In some cases, a user will want to use a specific credential helper for > a wildcard pattern, such as https://*.corp.example.com. Use the urlmatch > code to match a pattern for a URL to allow this behavior. > > Since we are handling URLs using urlmatch, remove the custom code to > match URLs, since it is no longer needed. I think it's a good idea to unify the matching for these two subsystems if we can. Playing devil's advocate for a moment: - are there cases that would have matched before that won't now? Basic stuff like: echo url=https://example.com/foo | GIT_TERMINAL_PROMPT=0 git \ -c credential.https://example.com/foo.helper='!echo >&2 full helper' \ -c credential.https://example.com/fo.helper='!echo >&2 partial helper' \ credential fill seems to behave the same (we respect only the full match both before and after your patch). I wondered if we might run into problems matching usernames. But this: echo url=https://user@xxxxxxxxxxx/ | GIT_TERMINAL_PROMPT=0 git \ -c credential.https://example.com.helper='!echo >&2 bare helper' \ -c credential.https://user@xxxxxxxxxxx.helper='!echo >&2 user helper' \ credential fill seems to be behave the same both before and after (it runs both helpers). And this: echo url=https://example.com/ | GIT_TERMINAL_PROMPT=0 git \ -c credential.https://example.com.helper='!echo >&2 bare helper' \ -c credential.https://user@xxxxxxxxxxx.helper='!echo >&2 user helper' \ credential fill likewise behaves the same before and after, running only the bare helper (I think we _could_ match the user-helper by doing a second pass over the config after getting the username from config, from another helper, or from the user, but I doubt anybody is clamoring for that feature). So I think we're OK here, unless you can come up with any more obscure case. - are there cases that match now that didn't before? Those are a bit dangerous become they may mean unexpectedly sharing credentials with hosts the user didn't mean to. Obviously anything with a "*" in it will behave differently, but I think that's OK (i.e,. I doubt anybody wrote "*" in a hostname and didn't mean it for it to glob). One interesting one is that the credential code currently requires a full path match instead of a prefix match. So: echo url=https://example.com/foo/bar | GIT_TERMINAL_PROMPT=0 git \ -c credential.https://example.com/foo.helper='!echo >&2 run helper' \ credential fill doesn't currently trigger the helper, but does with your patch. I think this is _probably_ OK. I think the existing behavior was done to just always err on the side of strictness in matching, with the thought that nobody cared that much about path matching anyway (by default we ignore the paths completely). - The rules for ordering are a bit different, in that the credential code takes any matches in the order in which they appear in the config file. Whereas urlmatch won't pass on any less-specific matches. So doing: echo url=https://example.com/foo | GIT_TERMINAL_PROMPT=0 git \ -c credential.https://example.com/foo.helper='!echo >&2 path helper' \ -c credential.https://example.com.helper='!echo >&2 bare-domain helper' \ credential fill right now will trigger both helpers (path then bare-domain), but after your patch will just trigger the path helper. Curiously, if you flip the order of the two config keys, it will still run both. That's because urlmatch assumes that the receiver is handling the "last one wins" semantics, but the credential code is actually building a list of helpers. Weirder still, the current strategy for single-valued items like credential.*.username is "first one wins". So: echo url=https://example.com/foo | git \ -c credential.https://example.com.username=domain-user \ -c credential.https://example.com/foo.username=path-user \ credential fill will prompt for the password of domain-user, but flipping it will ask for path-user. Which seems...weird. That actually doesn't change with your patch, because we'll either send both to the credential callback (in the order above) or we'd just send the first one (if the order is flipped). I'd be in favor of changing that to the usual "last one wins", because it makes no sense and is unlike the rest of Git's config. However, I think the reason it's written that way is so that: echo url=https://foo@xxxxxxxxxxx | git \ -c credential.https://example.com.username=bar \ credential fill still takes the username=foo from the command-line. We'd want to take care not to break that case, which implies remembering whether c->username came from a previous config value, or was part of the initial credential lookup. That doesn't _technically_ have to be part of your patch, because your patch doesn't make it any worse than it is now. But it would be nice to cleanup (since urlmatch gives us nice "most specific wins" semantics). But I do think we need to deal with the helper-list behavior as part of your patch, because it _does_ regress. And while the example above is probably fairly contrived, I think urlmatch is responsible for matching credential.helper, too (as the least specific possible match), and it's not outrageous to have something like: [credential] helper = cache [credential "https://example.com"] helper = !some-host-specific-helper Some options are: - teach urlmatch to pass matching config keys to our callback even if they're "worse" than a previously-seen key, so that we can then record all helpers in the order they appear in the config (retaining the existing behavior) - use urlmatch's cmp_matches() to order the list of helper. This would be a change in behavior, but I wonder if it might be what people prefer. I suspect it would make some happy (if the host-specific helper can answer the query above, you'd just as soon not run the cache helper at all) and others not (if the host-specific one is expensive or requires user interaction, you may want to try the cache first). So I'm not sure if it would be a good idea or not. A few comments on the patch itself: > --- a/Documentation/gitcredentials.txt > +++ b/Documentation/gitcredentials.txt > @@ -131,7 +131,9 @@ context would not match: > because the hostnames differ. Nor would it match `foo.example.com`; Git > compares hostnames exactly, without considering whether two hosts are part of > the same domain. Likewise, a config entry for `http://example.com` would not > -match: Git compares the protocols exactly. > +match: Git compares the protocols exactly. However, you may use wildcards in > +the domain name and other pattern matching techniques as with the `http.<url>.*` > +options. You'd probably want to review the documentation to accommodate any of the behavior changes discussed above that we end up with. > static void credential_apply_config(struct credential *c) > { > + char *normalized_url; > + struct urlmatch_config config = { STRING_LIST_INIT_DUP }; Here we initialize a string_list, but I don't think we ever free it. It looks like the urlmatch_config_entry() callback will stash items here for its "more specific match" checks. I think we need: string_list_clear(&config.vars, 1); which builtin/config.c:get_urlmatch() does. It also seems to free config.url.url, which is touched by the normalization process. I _think_ this is always the same as the return value normalized_url, so we'd be OK there. > + config.section = "credential"; > + config.key = NULL; > + config.collect_fn = credential_config_callback; > + config.cascade_fn = git_default_config; > + config.cb = c; I don't think the old code would ever call git_default_config (we _just_ want to load values for this specific URL). So I think you'd want to leave the cascade_fn NULL here? > + credential_describe(c, &url); > + normalized_url = url_normalize(url.buf, &config.url); The purpose of credential_describe() so far has been to show the URL to the user. It won't do any %-encoding that would be required for more exotic URLs. But I assume we'd want that in whatever we feed to url_normalize. So for example: echo url=https://example.com/%2566 | GIT_TERMINAL_PROMPT=0 git \ -c credential.https://example.com/%2566.helper='!echo >&2 run helper' credential fill matches in the current code, but not after your patch (we decode %25 into just "%", and then feed "%66" to the url normalizer, which decodes it to "f". -Peff