On Thu, Jun 30 2022, Glen Choo via GitGitGadget wrote:

> This is a quick re-roll to address Ævar's comments on the tests (thanks!).

Thanks!

> = Description

Just more generally on this series & approach. I know this is a v6 by
now, and I haven't kept up with this topic, but to be fair I did mention
pretty much this in:
https://lore.kernel.org/git/220407.86lewhc6bz.gmgdl@xxxxxxxxxxxxxxxxxxx/

So...

> There is a known social engineering attack that takes advantage of the fact
> that a working tree can include an entire bare repository, including a
> config file. A user could run a Git command inside the bare repository
> thinking that the config file of the 'outer' repository would be used, but
> in reality, the bare repository's config file (which is attacker-controlled)
> is used, which may result in arbitrary code execution. See [1] for a fuller
> description and deeper discussion.
>
> This series implements a simple way of preventing such attacks: create a
> config option, discovery.bare, that tells Git whether or not to die when it
> finds a bare repository. discovery.bare has two values:
>
> * "always": always allow bare repositories (default), identical to current
>   behavior
> * "never": never allow bare repositories
>
> and users/system administrators who never expect to work with bare
> repositories can secure their environments using "never". discovery.bare has
> no effect if --git-dir or GIT_DIR is passed because we are confident that
> the user is not confused about which repository is being used.

I'm not insisting that the entire approach here should be changed. In
the above exchange you seemed to have performance concerns about the
"just walk up in setup.c" approach I mentioned, but it's not clear if
that's still the only thing that necessitates taking this approach.
There may be security subtleties that I've missed, but from the
description here it seems like that would work equally well, and
wouldn't require configuration, except insofar as we'd need to opt-in to
reading config from bare repositories *that also exist in a parent
tree*.

And it would be a more narrow & more secure solution, since it would
e.g. allow you to intentionally navigate to /var/repos/git/git.git in a
server setup and read the config there, which it could distinguish from
a case of /var/repos/.git existing, and git/git.git being brought in as
a part of that "parent" repo.

The "more narrow" and "more secure" go hand-in-hand, since if you work
on such servers you'd turn this to "always" because you want to read
such config, but then be left vulnerable to the actual (and much rarer)
exploit we're trying to prevent.

Which, it seems...

> This series does not change the default behavior, but in the long-run, a
> "no-embedded" option might be a safe and usable default [2]. "never" is too
> restrictive and unlikely to be the default.

This series has (since v3?) been noting aspirations to have a
"no-embedded" variant of this config, which your 5/5 here notes would be
better, but isn't implemented by this series.

But your 5/5 also notes:

	but detecting if a repository is embedded is potentially
	non-trivial, so this work is not implemented in this series.

Hrm, well, the diff-stat isn't quite that trivial either :) :

> [...]
> upload-pack.c | 27 ++++++----
> 12 files changed, 304 insertions(+), 47 deletions(-)

In threads linked from the above ML link I linked to some POC code
showing how to hack a second .git discovery walk into setup.c.
This was as part of the "submodule parent dir" proposal, which is a
different feature, but also needs such "find the parent" code:
https://lore.kernel.org/git/211109.86v912dtfw.gmgdl@xxxxxxxxxxxxxxxxxxx/

Now, obviously that's a dirty hack, but it's not that hard to just
change the part of setup.c where we're satisfied that we've found the
git dir, then walk up "$THAT_DIR/..", and start our search again.

Then:

	if (first_dir_was_bare() && found_parent_dir())
		enforce_no_embedded();

Isn't that what your proposed "no embedded" option would need to do?

Well, maybe we'd also check if the "first dir" is in the index of the
parent, as opposed to just being a bare .git somewhere in ~/Downloads,
e.g. if you have a ~/.git and keep your dot-files in git.

But I think for an initial implementation just doing the walk would be
good enough, and would have a more narrow scope than this configuration
setting.

AFAICT the performance concerns aren't supported by any data. In the
case of the "submodule superproject" feature the bottleneck turned out
to be not the directory walk, but us shelling out in a loop in
git-submodule.sh.

Well, *maybe* that's not the case. I think I have managed to read
between the lines of some of these past exchanges that there's some odd
proprietary internal NFS-like setup at Google where *parent dirs* are
auto-mounted and searched on access, so a "walk up" pattern would be
much more expensive.

I do worry a bit about us ending up with design choices in git that we
wouldn't have ended up with, if not to cater to some in-house setup
somewhere that 99.99% of git users will never see.

But I don't have the full picture on the "submodule superproject"
problem, or this one, and maybe I'm missing something.

Just food for thought, and wondering where we're eventually taking this.

Thanks!
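
To make the "walk up and search again" idea above a bit more concrete,
here is a rough standalone sketch. It is not the setup.c code and not
the POC linked above: find_parent_repo(), should_refuse() and
demo_probe() are names made up for illustration, the "probe" callback
stands in for whatever check setup.c would really do, and paths are
assumed to be normalized absolute paths without a trailing slash.

```c
#include <stddef.h>
#include <string.h>

#define PATH_BUF_LEN 4096

/* Hypothetical callback: does "dir" contain a repository (e.g. a ".git")? */
typedef int (*repo_probe_fn)(const char *dir);

/*
 * Strip the last component of "path" in place, e.g. "/a/b" -> "/a"
 * and "/a" -> "/". Returns 0 when there is no parent left.
 */
static int strip_last_component(char *path)
{
	char *slash = strrchr(path, '/');

	if (!slash)
		return 0;		/* no slash at all: nothing above */
	if (slash == path) {
		if (path[1] == '\0')
			return 0;	/* already at "/" */
		path[1] = '\0';		/* "/foo" -> "/" */
		return 1;
	}
	*slash = '\0';
	return 1;
}

/*
 * Start at the parent of "found_dir" and walk up until probe() says a
 * directory holds a repository. On success, copy that directory into
 * "out" and return 1; return 0 if we reach the root without a match.
 */
static int find_parent_repo(const char *found_dir, repo_probe_fn probe,
			    char *out, size_t outlen)
{
	char buf[PATH_BUF_LEN];
	size_t len = strlen(found_dir);

	if (!len || len >= sizeof(buf))
		return 0;
	memcpy(buf, found_dir, len + 1);

	while (strip_last_component(buf)) {
		if (probe(buf)) {
			size_t n = strlen(buf);

			if (n >= outlen)
				return 0;
			memcpy(out, buf, n + 1);
			return 1;
		}
	}
	return 0;
}

/*
 * The decision from the pseudocode above: only refuse a bare
 * repository if some directory above it is itself a repository.
 */
static int should_refuse(const char *found_dir, int is_bare,
			 repo_probe_fn probe)
{
	char parent[PATH_BUF_LEN];

	return is_bare &&
	       find_parent_repo(found_dir, probe, parent, sizeof(parent));
}

/* Toy probe for illustration: pretend only /var/repos has a ".git". */
static int demo_probe(const char *dir)
{
	return !strcmp(dir, "/var/repos");
}
```

With the toy probe, should_refuse("/var/repos/git/git.git", 1,
demo_probe) is true because /var/repos is found on the way up, while a
bare /srv/git/git.git with no repository above it is left alone. A real
version would presumably also want the "is it in the parent's index"
check mentioned above, which this sketch does not attempt.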