Re: [PATCH v6 0/5] config: introduce discovery.bare and protected config

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Fri, 01 Jul 2022 01:07:54 +0200

On Thu, Jun 30 2022, Glen Choo via GitGitGadget wrote:

> This is a quick re-roll to address Ævar's comments on the tests (thanks!).

Thanks!

> = Description

A more general comment on this series and its approach. I know this is
already a v6 and I haven't kept up with the topic, but to be fair I did
raise pretty much the same point in:
https://lore.kernel.org/git/220407.86lewhc6bz.gmgdl@xxxxxxxxxxxxxxxxxxx/

So...

> There is a known social engineering attack that takes advantage of the fact
> that a working tree can include an entire bare repository, including a
> config file. A user could run a Git command inside the bare repository
> thinking that the config file of the 'outer' repository would be used, but
> in reality, the bare repository's config file (which is attacker-controlled)
> is used, which may result in arbitrary code execution. See [1] for a fuller
> description and deeper discussion.
>
> This series implements a simple way of preventing such attacks: create a
> config option, discovery.bare, that tells Git whether or not to die when it
> finds a bare repository. discovery.bare has two values:
>
>  * "always": always allow bare repositories (default), identical to current
>    behavior
>  * "never": never allow bare repositories
>
> and users/system administrators who never expect to work with bare
> repositories can secure their environments using "never". discovery.bare has
> no effect if --git-dir or GIT_DIR is passed because we are confident that
> the user is not confused about which repository is being used.
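
To check that I'm reading the description right, here's a minimal sketch
of the policy as described above. Only "discovery.bare", "always",
"never" and the --git-dir/GIT_DIR exemption come from the cover letter;
the function and enum names are made up for illustration and are not
taken from the patches, and what happens on an unrecognized value is my
assumption:

```c
#include <string.h>

/* Hypothetical names; only the two values come from the cover letter. */
enum discovery_bare {
	DISCOVERY_BARE_ALWAYS,	/* default: identical to current behavior */
	DISCOVERY_BARE_NEVER,
	DISCOVERY_BARE_UNKNOWN	/* unrecognized value (handling is an assumption) */
};

/* Map a config value to a policy; an unset (NULL) value means the default. */
enum discovery_bare parse_discovery_bare(const char *value)
{
	if (!value || !strcmp(value, "always"))
		return DISCOVERY_BARE_ALWAYS;
	if (!strcmp(value, "never"))
		return DISCOVERY_BARE_NEVER;
	return DISCOVERY_BARE_UNKNOWN;
}

/*
 * Returns 1 if the discovered repository may be used, 0 if Git should die.
 * An explicit --git-dir or GIT_DIR bypasses the check entirely.
 * DISCOVERY_BARE_UNKNOWN is conservatively treated like "never" here.
 */
int bare_repo_allowed(enum discovery_bare policy, int is_bare,
		      int explicit_git_dir)
{
	if (!is_bare || explicit_git_dir)
		return 1;
	return policy == DISCOVERY_BARE_ALWAYS;
}
```

Note that nothing in this decision looks at *where* the bare repository
lives, which is what the rest of this reply is about.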

I'm not insisting that the entire approach here should be changed, but
in the above exchange you seemed to have performance concerns about the
"just walk up in setup.c" approach I mentioned, but it's not clear if
that's still the only thing that necessitates taking this approach.

There may be security subtleties that I've missed, but from the
description here it seems like that would work equally well, and
wouldn't require configuration, except insofar as we'd need to opt-in to
reading config from bare repositories *that also exist in a parent tree*.

And it would be a more narrow & more secure solution, since it would
e.g. allow you to intentionally navigate to /var/repos/git/git.git in a
server setup and read the config there, which it could distinguish from
a case of /var/repos/.git existing, and git/git.git being brought in as
a part of that "parent" repo.

The "more narrow" and "more secure" go hand-in-hand, since if you work
on such servers you'd set this to "always" because you want to read
such config, but would then be left vulnerable to the actual (and much
rarer) exploit we're trying to prevent.

Which, it seems...

> This series does not change the default behavior, but in the long-run, a
> "no-embedded" option might be a safe and usable default [2]. "never" is too
> restrictive and unlikely to be the default.

This series has (since v3?) been noting aspirations to have a
"no-embedded" variant of this config, which your 5/5 here notes would be
better, but isn't implemented by this series.

But your 5/5 also notes:

    but detecting if a repository is embedded is potentially
    non-trivial, so this work is not implemented in this series.

Hrm, well, the diff-stat isn't quite that trivial either :) :

> [...]
>  upload-pack.c                       | 27 ++++++----
>  12 files changed, 304 insertions(+), 47 deletions(-)

In threads linked from the ML message above I pointed to some POC code
showing how to hack a second .git discovery walk into setup.c. This was
as part of the "submodule parent dir" proposal, which is a different
feature, but also needs such "find the parent" code:
https://lore.kernel.org/git/211109.86v912dtfw.gmgdl@xxxxxxxxxxxxxxxxxxx/

Now, obviously that's a dirty hack, but it's not that hard to just
change the part of setup.c where we're satisfied that we've found the
git dir, then walk up "$THAT_DIR/..", and start our search again.

Then:

	if (first_dir_was_bare() && found_parent_dir())
		enforce_no_embedded();

Isn't that what your proposed "no embedded" option would need to do?
Well, maybe we'd also check if the "first dir" is in the index of the
parent, as opposed to just being a bare .git somewhere in ~/Downloads,
e.g. if you have a ~/.git and keep your dot-files in git.

But I think for an initial implementation just doing the walk would be
good enough, and would have a more narrow scope than this configuration
setting.
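
To make the walk concrete, here's a rough standalone sketch. It is not
setup.c code: the probe callback, demo_probe() and the /srv path are
all invented for illustration, and a real version would use the
existing discovery machinery instead of string comparisons. It only
shows the shape of "find the first repo, then keep walking from its
parent":

```c
#include <stddef.h>
#include <string.h>

enum repo_kind { REPO_NONE = 0, REPO_WORKTREE, REPO_BARE };

/* Hypothetical probe: what kind of repository, if any, lives at "dir"? */
typedef enum repo_kind (*probe_fn)(const char *dir);

/* Strip the last path component in place; returns 0 once we can't go up. */
static int strip_last_component(char *path)
{
	char *slash = strrchr(path, '/');

	if (!slash)
		return 0;		/* relative path with nothing left */
	if (slash == path) {
		if (path[1] == '\0')
			return 0;	/* already at "/" */
		path[1] = '\0';		/* "/foo" -> "/" */
		return 1;
	}
	*slash = '\0';
	return 1;
}

/* Walk up from "buf" until the probe finds a repo; "buf" ends up there. */
static enum repo_kind discover_from(char *buf, probe_fn probe)
{
	for (;;) {
		enum repo_kind kind = probe(buf);

		if (kind != REPO_NONE)
			return kind;
		if (!strip_last_component(buf))
			return REPO_NONE;
	}
}

/* 1 if the repo discovered from "start" is bare and sits inside another repo. */
int is_embedded_bare(const char *start, probe_fn probe)
{
	char buf[4096];

	if (!start || !probe || strlen(start) >= sizeof(buf))
		return 0;
	strcpy(buf, start);
	if (discover_from(buf, probe) != REPO_BARE)
		return 0;
	if (!strip_last_component(buf))
		return 0;		/* the bare repo is at the root */
	/* Second walk: start again from "$THAT_DIR/..". */
	return discover_from(buf, probe) != REPO_NONE;
}

/* Stand-in for real filesystem checks; models two made-up layouts. */
enum repo_kind demo_probe(const char *dir)
{
	if (!strcmp(dir, "/var/repos"))
		return REPO_WORKTREE;	/* i.e. /var/repos/.git exists */
	if (!strcmp(dir, "/var/repos/git/git.git") ||
	    !strcmp(dir, "/srv/git/git.git"))
		return REPO_BARE;
	return REPO_NONE;
}
```

With demo_probe(), is_embedded_bare("/var/repos/git/git.git", demo_probe)
is 1 because the second walk finds /var/repos, while the standalone
"/srv/git/git.git" gives 0. The extra cost over today's discovery is one
more walk to the root, and only when the first hit was a bare repo.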

AFAICT the performance concerns aren't supported by any data. In the
case of the "submodule superproject" feature the slowness turned out
not to be the directory walk, but us shelling out in a loop in
git-submodule.sh.

Well, *maybe* that's not the case. I think I have managed to read
between the lines of some of these past exchanges that there's some odd
proprietary internal NFS-like setup at Google where *parent dirs* are
auto-mounted and searched on access, so a "walk up" pattern would be
much more expensive.

I do worry a bit about us ending up with design choices in git that we
wouldn't have ended up with, if not to cater to some in-house setup
somewhere that 99.99% of git users will never see.

But I don't have the full picture on the "submodule superproject"
problem, or this one, and maybe I'm missing something. Just food for
thought, and wondering where we're eventually taking this.

Thanks!