Re: [PATCH] hooks: propose repository owner configured hooks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Mar 19, 2021 at 3:27 AM Ævar Arnfjörð Bjarmason
<avarab@xxxxxxxxx> wrote:
>
>
> On Thu, Mar 18 2021, Albert Cui via GitGitGadget wrote:
>
> > We propose adding native Git functionality to allow project maintainers to
> > specify hooks that a user ought to install and utilize in their development
> > workflows. This patch documents the requirements we propose for this feature
> > as well as a design sketch for implementation.
>
> I like this goal. It's something I proposed (off-list) before and did a
> small write-up of here:
> https://lore.kernel.org/git/87zi6eakkt.fsf@xxxxxxxxxxxxxxxxxxx/
>
> As far as I recall the response in the room at the time was the expected
> combination of "sure that would be neat" and "let's see the
> patches". I.e. I don't think there's hard objections to the existence of
> such a facility, but of course the devel is in the details, and
> obviously I never followed-up with actually trying to implement it.
>
> With config-based hooks this'll be much easier for the hook case, and
> I've tried to help that along[A]. I'd be interested in reviewing any
> effort in this area as well. The ".githooks/*" case in that proposal
> goes away with config-based hooks (since they wouldn't be special
> anymore, just config).
>

Thanks for the context and the support. Agreed, I think this is a
natural evolution of config-based hooks.

[...]
> > +* As part of the fetch subcommand, Git prompts users to install the configs
> > +contained there.
> > +
> > +    ** User responses to that prompt could be "sticky" - e.g. a user could reply
> > +    "no (this time)", "no (never)", "yes (this time)", or "yes (always)".
> > +    Always/never indicate that the user trusts the remote this config is coming
> > +    from, and should not apply to configs fetched from other remotes.
>
> As noted in the proposal I linked I think anyone thinking about this
> would do well to examine the trade-off Emacs's implementation of this
> uses, since it manages to safely open arbitrary files that'll
> potentially run arbitrary code on-open, but only if the user opts-in.
>

In [0], I proposed only allowing pre-clone opt-in to suggested hooks
if allowlisted in the config for the given remote, which seems similar
to your previous proposal. Extending this idea to any config settings
seems very reasonable. I'd love other people's thoughts about this.

In your proposal, you wrote:

> It would work really well with includeIf, e.g. I could clone all my work repos to a "safe" area in ~/work which is allowed to set more options, e.g. aliases.

It seems much safer to do this for a given remote, opposed to a local
file path, no?

> > +Later, we might want to do this before the initial clone is performed; that
> > +workflow looks like:
> > +
> > +* During clone, perform ls-refs as normal
> > +
> > +* If the server has a "magic" config branch, fetch only that config branch.
>
> ...the reason for the magic branch is this before-clone use-case?
>
> Unless there's a really strong use-case for that I'd think the
> per-branch .gitconfig would be a better trade-off, but even then there
> would be ways to make that work obviously in the many-many case, and
> still e.g. have a branch advertise a config blob for its "main" branch
> as a special case or something.
>
> I also suspect an unstated constraint of having this in a magic branch
> is the limitation of some git hosting provider's ACL
> implementations. More on that later...
>

Mentioned in [0], the primary motivation for a magic config branch
that lived outside of the worktree was "since the configuration is
separate from the code base, it allows you to go back in history or to
older branches while preserving "improvements" to the hooks
configuration e.g. maybe the project has shifted to using a newer
version of a linter."

[...]
> > +Future Work
> > +~~~~~~~~~~~
> > +
> > +* Extending this to allow repository owners to specify specific configurations
> > +in general e.g. this repository should use partial-clone with these parameters.
>
> I don't see why such a proposal should narrowly confine itself to hooks.
>
> Once we have config-based hooks then hooks are just configuration.
>
> If we're going to pick up arbitrary configuration from remotes/repos on
> request then we'd be starting with the most dangerous configuration.
>

Summarizing from [0]: Based on this feedback, I'm hearing "we should
have a design for suggested configuration in general," but I don't
think that necessitates that we should pursue generic configuration
before hooks configuration.

> I think a better way to start such an effort incrementally would be:
>
> 1. Audit git's config parser for the safety of parsing arbitrary config,
>    we parse .gitmodules now, do we feel it's safe enough to parse
>    arbitrary config (probably, but worth checking).
>
> 2. Add reflection to git's own config variables. Right now we have this
>    in the binary generated via a grep from the documentation. Similar to
>    Emacs's implementation we should/could categorize all our known
>    config variables by safety.
>

To clarify, are you saying, today, git's config variables are pulled
from the documentation? I.e. the documentation is the source of truth
for what config variables are supported? o.0

>    Hooks are the least safe, something like a diff.context=N setting the
>    repo wants to suggest a -U<n> setting much safer (just tweaking our
>    existing diff algorithm) etc. But even in those cases we'd want to
>    proceed slowly, e.g. is that config parsing for that -U<n> defensive
>    enough to be safe for arbitrary data?
>

To clarify, this proposal is just to set the hooks config that
config-based hooks enabled e.g. running `git config --add
hook.pre-commit.command pylint` on behalf of the user, so I'm not sure
what "arbitrary data" you're referring to. At least, any problems I'd
think we'd already address with config-based hooks.

> 3. Users should be able to e.g. configure "yes, for any repo I clone
>    they can tweak 'safe'" variables. This would reduce user dialog
>    fatigue, and thus increase safety. I.e. a repo who wants to set a
>    thing like hooks would stand out, but something that e.g. wants to
>    change the diff order would pass on existing global configuration.
>
> 4. Smarter ACL that magic remote+magic branch: It seems like an obvious
>    thing to me to want that if I clone e.g. a random clone of git.git
>    that I'd want to setup config for it IFF the .gitconfig in it is
>    reachable from a tag GPG signed by <approved key>.
>
>    Git's a distributed system, so while I don't think it's bad to have
>    some feature like "I always trust config from this remote" (e.g. a
>    corporate environment where you know its .gitconfig is
>    guarded/audited) we should think about more distributed-friendly
>    solutions first and if possible guide users towards those.

This seems like an OK alternative to allow-listing based on remote,
but are you expecting users to add a GPG key to their .gitconfig?
Remote URLs seem much more user friendly (think IP address vs URL).

[0]: https://lore.kernel.org/git/CAMbkP-S-9cccMpU4HG0Wurqap-WkTmD2zk50nKd9kJ_oWO__qw@xxxxxxxxxxxxxx/




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux