Re: [PATCH 2/2] config: allow specifying config entries via envvar pairs

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Wed, 18 Nov 2020 14:44:59 +0100

On Wed, Nov 18 2020, Jeff King wrote:

> On Tue, Nov 17, 2020 at 03:22:05PM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> > then I'd feel comfortable making it a public-facing feature. And for
>> > most cases it would be pretty pleasant to use (and for the unpleasant
>> > ones, I'm not sure that a little quoting is any worse than the paired
>> > environment variables found here).
>> 
>> I wonder if something like the git config -z format wouldn't be easier,
>> with the twist that we'd obviously not support \0. So we'd need an
>> optional length prefix. : = unspecified.
>> 
>>     :user.name
>>     Jeff K
>>     :alias.ci
>>     commit
>>     :10:bin.ary
>>     <10 byte string, might have a \n>
>>     :other.key
>>     Other Value
>> 
>> Maybe that's overly fragile, or maybe another format would be better.
>
> Yeah, length-delimited strings are an alternative that some people think
> is less error-prone than quoting. And we do use pkt-lines. They're also
> a pain for humans to write (it's nicer if they're optional, but when you
> _do_ have to start using them, now you are stuck counting things up).
>
>> I was trying to come up with one where the common case wouldn't
>> require knowing about shell quoting/unquoting, and where you could
>> still do:
>> 
>>     GIT_CONFIG_PARAMETERS=":my.new\nvalue\n$GIT_CONFIG_PARAMETERS"
>> 
>> Or equivalent, and still just keep $GIT_CONFIG_PARAMETERS as-is to pass
>> it along.
>> 
>> Your "do not require quoting" accomplishes that, and it's arguably a lot
>
> Looks like your mail got cut off.

Nothing important, probably :)

> But yeah, the goal of making the quoting optional was to make it
> easier for humans to use for simple cases. It doesn't help at all with
> other programs inserting values, which can just as easily err on the
> side of caution.
>
> BTW, there is another problem with GIT_CONFIG_PARAMETERS (and "git -c"
> in general). The dotted config-key format:
>
>   section.subsection.key
>
> is unambiguous by itself, even though "subsection" can contain arbitrary
> bytes, including dots. Because neither "section" nor "key" can contain
> dots, we can parse from either end, and take the whole middle as a
> subsection (and this is how we do it in the code).
>
> But an assignment string like:
>
>   section.subsection.key=value
>
> _is_ ambiguous. We have to parse left-to-right up to the first equals
> (since "value" can contain arbitrary characters, including an equals).
> But "subsection" can have one, too, so we want to parse right-to-left
> there. E.g., in:
>
>   one.two=three.four=five
>
> this could be either of:
>
>   - section is "one", key is "two", value is "three.four=five"
>
>   - section is "one", subsection is "two=three", key is "four", value is
>     "five"
>
> We currently always parse it as the former (which I think is least-bad
> of the two, since values are more likely than subsections to contain
> arbitrary text with an equals).

Yeah, it's a pain to parse if it's on one line. FWIW that's the main
reason the format I suggested moved to being \n-delimited: keys can't
contain a \n, so you can unambiguously have them be \n-delimited (as
git config -z does).
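
To make the one-line ambiguity concrete, here is a minimal Python sketch
(not git's actual parser) that lists every way an assignment string can be
split into a well-formed key and a value. The helper names are made up, and
the rule that "section" and "key" consist only of alphanumerics and "-" is a
simplification I'm assuming for illustration:

```python
import re

# Assumed, simplified rule for section and key names (hypothetical helper).
NAME = re.compile(r"[A-Za-z0-9-]+")

def split_key(name):
    """Split 'section[.subsection].key' by parsing from both ends."""
    first, last = name.find("."), name.rfind(".")
    if first == -1:
        return None
    section, key = name[:first], name[last + 1:]
    # Everything between the first and last dot is the subsection.
    subsection = name[first + 1:last] if last > first else None
    if not NAME.fullmatch(section) or not NAME.fullmatch(key):
        return None
    return section, subsection, key

def candidate_parses(assignment):
    """Return every (section, subsection, key, value) reading of the string."""
    parses = []
    for i, ch in enumerate(assignment):
        if ch != "=":
            continue
        parts = split_key(assignment[:i])
        if parts is not None:
            parses.append(parts + (assignment[i + 1:],))
    return parses
```

For "one.two=three.four=five" this yields both readings from the list above,
with the split at the first "=" coming first.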

You do need to worry about a \n in the value, but for the common case
where you don't have a \n there we wouldn't need to provide the length.
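
A rough sketch of how a reader for that format could look, in Python. This is
only my interpretation of the example earlier in the thread, not an
implementation of anything that exists: the function name is hypothetical, and
I'm assuming that a length-prefixed value is followed by a terminating \n and
that a run of digits plus ":" before the key is always a length prefix (a
section name can't contain ":", so that should not collide with a real key):

```python
import re

# Optional "<length>:" in front of the key, e.g. ":10:bin.ary".
LENGTH_PREFIX = re.compile(rb"(\d+):")

def parse_params(data):
    """Parse b':key\\nvalue\\n' records into a list of (key, value) pairs."""
    entries = []
    pos = 0
    while pos < len(data):
        if data[pos:pos + 1] != b":":
            raise ValueError(f"expected ':' at offset {pos}")
        eol = data.find(b"\n", pos)
        if eol == -1:
            raise ValueError("key line is not newline-terminated")
        header = data[pos + 1:eol]
        pos = eol + 1
        m = LENGTH_PREFIX.match(header)
        if m:
            # Length given: take exactly that many bytes, \n included.
            length = int(m.group(1))
            key = header[m.end():]
            value = data[pos:pos + length]
            if len(value) != length:
                raise ValueError("truncated value")
            pos += length
            if data[pos:pos + 1] == b"\n":
                pos += 1  # assumed terminator after the value
            elif pos < len(data):
                raise ValueError("missing newline after value")
        else:
            # Common case: the value runs to the next \n (or end of input).
            key = header
            eol = data.find(b"\n", pos)
            if eol == -1:
                eol = len(data)
            value = data[pos:eol]
            pos = eol + 1
        entries.append((key, value))
    return entries
```

Prepending a new entry is then plain concatenation, as in the
GIT_CONFIG_PARAMETERS example quoted above: parse_params(b":my.new\nvalue\n" +
old) gives the new pair followed by whatever was in old.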

Or just provide tooling as you suggested in
<20201118015907.GD650959@xxxxxxxxxxxxxxxxxxxxxxx>, which I like better
than any one format suggestion (including the one I suggested). I.e. we
can document that:

 - The variable exists
 - You read/write/add to it using a return value from this tool

Which allows for keeping the value itself opaque and open to a future
change.