Re: [PATCH 2/2] diff: teach diff to read gitattribute diff-algorithm

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 07 2023, Jeff King wrote:

> On Sun, Feb 05, 2023 at 03:46:21AM +0000, John Cai via GitGitGadget wrote:

> From the user's perspective, this is weirdly inconsistent with the
> existing diff attributes, which would be more like:
>
>   # in .gitattributes
>   *.json diff=json 
>
>   # in config
>   [diff "json"]
>   algorithm = histogram

That does look more elegant.

> I know why one might choose the scheme you did; it kicks in if the repo
> sets the algorithm, without users having to set up any extra config.
> Which is sort of nice, if we assume that malicious actors don't have any
> incentive to pick the algorithm. In theory they don't, though I saw Ævar
> mention possible DoS elsewhere in the thread.
>
>   Side note: It's also possible that algorithm selection could be
>   required to trigger a separate security bug (say, a buffer overflow in
>   the patience code or something), so restricting that works in a
>   belt-and-suspenders way. But that somehow feels like like the wrong
>   side of the paranoia-vs-feature line.
>
> So I dunno. I recognize that this scheme fulfills your immediate needs
> better, but I fear that we'll be stuck with a weird split between "diff"
> and "diff-*" attributes forever. In the long run, having a way for the
> repo to say "and here is some config I recommend to you" would give you
> the best of both, but that is a challenging topic that has been
> discussed and punted on for many years.

If (and I'm not taking a stance on whether this is the case here) we
think that a hypothetical .gitconfig in-repo combined with
.gitattributes would be more elegant for this or other cases, I don't
see why the path to some very limited version of that where we read
literally one whitelisted config variable from such a .gitconfig, and
ignore everything else, wouldn't be OK.

I.e. the more general in-repo .gitconfig discussion (of which there was
some discussion this past Git Merge in Chicago) includes potentially
much harder things

Like what to do with changing genreal in-repo config, if and how to
compare that against some local whitelist of what sort of config we
should trust (or none) etc. etc.

But if we agreed that we'd be willing to trust the remote with some
config-by-another-name unconditionally, as is being proposed here, we
could easily read that one thing from an in-repo .gitconfig, ignore
everything else, and then just document that interface as a
special-case.

We have several such "limited config" readers already, and the relevant
bits of the config parser already have to handle arbitrary "git config"
files in-tree, due to .gitmodules.

We even have special-snowflake config in the configspace already that
doesn't obey the usual rules: The trace2.* config is only read from the
--system config, as it has an inherent bootsrtapping problem in wanting
to log as early as possible.

The advantage of the limited in-repo .gitconfig would be that if we
think the interface was more elegant this would be a way to start small
in a way that would be future-proof as far as the permanent UX goes.

If there's interest the read_early_config(), read_very_early_config()
and gitmodules_config_oid() functions are good places to look.

The last one in particular could be generalized pretty easily to read a
limited in-tree .gitconfig.

Although plugging it into the "config order" might be tricky, I haven't
tried.

Even a minimal implementation of such a thing would probably want a "git
config --repository" (or whatever we call the flag) to go with
"--local", "--system" etc, to spew out something sensible from
"--show-origin" etc.




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux