Re: [PATCH] urlmatch: create fetch.credentialsInUrl config

Derrick Stolee <derrickstolee@xxxxxxxxxx> · Wed, 25 May 2022 10:03:43 -0400

On 5/24/2022 5:01 PM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Tue, May 24 2022, Derrick Stolee wrote:
> 
>> On 5/24/2022 4:18 AM, Ævar Arnfjörð Bjarmason wrote:
>>>
>>> On Mon, May 23 2022, Derrick Stolee via GitGitGadget wrote:
>>>
>>>> +fetch.credentialsInUrl::
>>>> +	A URL can contain plaintext credentials in the form
>>>> +	`protocol://<user>:<password>@domain/path`. Using such URLs is not
>>>> +	recommended as it exposes the password in multiple ways. The
>>>> +	`fetch.credentialsInUrl` option provides instruction for how Git
>>>> +	should react to seeing such a URL, with these values:
>>>
>>> Re the previous discussion about this (in the v1 patch /
>>> https://lore.kernel.org/git/pull.945.git.1619807844627.gitgitgadget@xxxxxxxxx/):
>>> In what ways?
>>>
>>> That's rhetorical, the point being: Let's adjust this documentation to
>>> discuss exactly why this is thought to be bad, what we're mitigating for
>>> the user etc., are there situations where running git like this is
>>> perfectly fine & not thought to be an issue? E.g. no password manager
>>> and you trust your FS permission? Let's cover those cases too.
>>
>> This documentation is not the proper place to tell the user "do this
>> and you can trust your plaintext creds in the filesystem" because that
>> is asking for problems. I'd rather leave a vague warning and let users
>> go against the recommended behavior only after they have done sufficient
>> work to be confident in taking on that risk.
> 
> I don't mean that we need to cover the full divergent views on different
> approaches to local password management, but not leave the user hanging
> with the rather scary "exposes the password in multiple ways".
> 
> I.e. if I read that for any software whose implementation I wasn't very
> familiar with I'd be very afraid, and in git's case for no reason.
> 
> Does in mean that git has some scary git-specific feature that would
> expose it. perhaps there's a local log that's unsecured where attempted
> URLs are logged, or perhaps we send the raw requested URL to the server
> so it can suggest alternatives for us. We do neither, but even a
> generally knowledgeable user won't know that about git in particular.
> 
> Whereas what I think you actually mean and are targeting here is better
> explained by:
> 
>     Git is careful to avoid exposing passwords in URLs on its own,
>     e.g. they won't be logged in trace2 logs. This setting is intended
>     for those who'd like to discourage (warn) or enforce (die) the use
>     of the password helper infrastructure over hardcoded passwords.
> 
> All of which I *think* is correct, but maybe I've missed something you
> know about, as that "in multiple ways" is doing a lot of work.
> 
> I also wonder if this wouldn't be even more useful if we took some
> lessons from ssh's book. I.e. per "git config -l --show-origin" we know
> the original of all config. We could be even more useful (and more
> aggressive about warning about) cases where we have passwords in config
> files that we detect don't have restrictive permissions, as OpenSSH does
> with your private key.
> 
> Ditto perhaps when the origin is "command line", as we do nothing to
> hide that from the process list on shared systems (and that would be
> racy whatever we did).

I tried to be careful about how "it" (being "Using such URLs") can
expose the password includes things that are not under Git's
responsibility (such as command-line histories and other system-level
logs) but I can add a bit about how Git stores the plaintext password
in the repository's config.

>>>> +	char *value = NULL;
>>>
>>> This init to NULL should be removedd, as we....
>>>
>>>> +	const char *at_ptr;
>>>> +	const char *colon_ptr;
>>>> +	struct strbuf anonymized = STRBUF_INIT;
>>>
>>> nit: Just call this "sb"? The's at least one line below over 79
>>> characters that's within the bounds with a shorter variable name, and in
>>> this case it's obvious what we're doing here...
>>
>> I will not change this name to be less descriptive.
> 
> Sure, just a suggestion. The other way is to just re-wrap that one
> line... :)
> 
> In the end I don't care, "just a nit", but just as one datapoint from
> reading this code: I find this varibale name in particular to be the
> polar opposite of descriptive, we're explicitly not anonymizing the URL
> in this function, since we're not stripping the username part.
> 
> Wouldn't descriptive be something more like uri_redacted_password or
> uri_no_password in this case?

How about "redacted"?

> Just for the implementation: It's slightly more wasteful, but in this
> case we don't care about performance, so perhaps a strbuf_splice()
> variant is easier here? I.e. add the full URL, find : and @, then
> strbuf_splice() it. It gets rid of much of the pointer juggling here &
> adding things incrementally.

TIL. strbuf_splice() will work perfectly. Thanks.

-Stolee