On Tue, Apr 26 2022, Derrick Stolee wrote: [Snip and just focusing on the "what to do with config" part of this]: > Question: Is "protected" config _really_ more trustworthy? > ---------------------------------------------------------- > > This leads to an interesting question: Do we think that `~/.gitconfig` > and `/etc/gitconfig` are "more trustworthy" than `.git/config`? > > I think that if an attacker has access to write to system or global config, > then they have access to do more harm without Git. On the other hand, > local config overrides the other config values, so local config can "unset" > any potentially-malicious values in system and global config. I don't > think such "defensive config" is common. I think this is a subset of areas where we rightfully piggy-back on FS permissions, and should continue to do so. I.e. we should trust /etc/gitconfig and the like because someone who can modify it can modify /usr/bin/git anyway, so trying to defend against anything in that area is pointless. Yes we allow overriding that config, but that should be thought of as a convenience, not as a nascent security boundary. > Example Security Boundary Question: Unprotected Config and Executing Code > ------------------------------------------------------------------------- > > We have a lot of ways of executing external code from within Git. This is > a key extensibility feature for customizing Git functionality. One way to > do this is to create executable files in the $GIT_DIR/hooks/ directory. > Git will discover these hooks and run them at the appropriate times. > > There are also many config options that specify external commands to run: > > * `core.fsmonitor=<path>` is executed before scanning the worktree for > created or modified files. > > * `core.editor` specifies an editor when writing commit messages, among > other user-input scenarios. > > * `credential.helper` specifies an external tool to assist with connecting > to a remote repository. > > * `uploadPack.packObjectsHook` chooses an alternative for `git pack-objects` > during `git upload-pack`. > > The list is actually quite long. This last one, `uploadPack.packObjectsHook` > _does_ do a check for protected config: it does not allow its value to > come from repository-local config. > > However, most of these options really do want to have customization on a > per-repository basis, hence this proliferation of config options and > local hook directories. > > I'm concerned that as long as we allow arbitrary execution to happen based > on repository-local data, we will always have security risks in Git. For > that reason, I'm interested in exploring ways to change how we execute > code externally, even if it means we would (eventually) need to introduce > a breaking change. I agree that we should mainly be thinking of these config values that directly execute something external, but as elaborated on below I think any security solution that narrowly focuses only on these is on the wrong path. You can e.g. point git-send-email to hostile server, or disable its SSL/TLS settings with config. Ditto HTTP settings to disable certificate checking etc. etc. You can also set transfer.fsckObjects=false or one of the fsck.* settings and open the door to fetching a payload which propagates a known part CVE. But more below. > The idea would be to allow repository-local customization by selecting > from a list of "installed" executables. The list of "installed" > executables comes from protected config (and the Git binary itself). Most of this type of config doesn't point to a path to an executable, but is a string we'll give to "sh -c" or equivalent. E.g. for editors we couldn't naively add "emacs" to such a whitelist, as it supports command-line options to evaluate arbitrary code. How would your plan handle such cases? > The plan I would like to put forward is to restrict all external execution > to be from one of these sources: > > 1. Specified in system config (`/etc/gitconfig`). > 2. Specified in global config (`~/.gitconfig`). > 3. An allow-list of known tools, on $PATH (e.g. `vim`). > > Such a change would be a major one, and would require changing a lot of > existing code paths. In particular, this completely removes the > functionality of the `$GIT_DIR/hooks/` directory in favor of a config- > based approach. This would require wide communication before pulling all > support for the old way, and likely a 3.0 version designation. After the > old hook mechanism is removed, the "safe.directory" protection from 2.35.2 > would no longer be needed. Aside from any of the details of safe.directory & how we discover hook it was my understanding per [1] that Johannes Schindelin disagreed with this assessment of what safe.directory was for. I.e. now the known vector is a hook, but in the previous off-list discussions I'd proposed narrowing the new safe.directory error down to handle that hook case only, but per the "cat being out of the bag" in [1] there was concern about other non-hook issues being found. Perhaps that assessment has changed, just noting it here for completeness. In any case, I don't think that we'd need to make the removal of $GIT_DIR/hooks support in favor of config-based hooks a dependency of any such proposal. The current config-based hook proposal would allow you to exhaustively emulate $GIT_DIR/hooks by defining all our hooks to those paths. Therefore any security mechanism could surely consider the old $GIT_DIR/hooks to be handled equivalently to however it would handle that sort of config-based hooks configuration. > At minimum, in the short term, this would affect the proposed design of > config-based hooks. > > I think this is a good example to think about at a high level before going > into the technical details. We can use it to test any proposed security > boundary definitions to see if it lands on one side or another. Here are > some points: > > 1. These config-based executables cannot be set in a full repository by > a "git clone" operation. > > 2. These config-based executables can be set in an embedded bare > repository, but the user needs to move deeper into the repository for > that to have any affect. This leads to some amount of social engineering > being involved in the attack. See [4] for recent discussion on this. > > [4] https://lore.kernel.org/git/Ylobp7sntKeWTLDX@nand.local/ > > 3. If users are sharing a common Git repository, then if an attacker gains > control of one user's account, they can use the shared repository as an > attack vector to gain control of the other users' accounts. For this > case, do we consider the "safe.directory" config as an indicator of > "I trust this repo and all users that can access it _in perpetuity_" or > instead "I need to use this repo, even though it is owned by someone > else." > > 4. Git's dedication to backwards compatibility might prevent any attempt > to change how hooks are specified or config can be used to customize > its behavior. > > 5. The technical challenge of converting all possible execution paths may > be too daunting to ever feel the project is "complete" and be able to > confidently say "Git is safe from these kinds of attacks". > > > Conclusion > ---------- > > I look forward to hearing from the community about this topic. There are > so many things to think about in this space, and I hope that a lot of > voices can share their perspectives here. > > Please collect any action items here at the end. I would love to add a > doc to the Git tree that summarizes our conclusions here, even if it is > only a start to our evolving security boundary. This didn't make it on-list, but in the off-list discussion about safe.directory I pointed out that this class of problem is something Emacs has been dealing with for decades, and which we'd do well to try to emulate. [2] below is the relevant part of my <220303.861qzi3mag.gmgdl@xxxxxxxxxxxxxxxxxxx> (sent on Thu, 03 Mar 2022 19:27:59 +0100), I also mentioned it in passing in [3]. The brief overview for it in Emacs's documentation is available here: https://www.gnu.org/software/emacs/manual/html_node/emacs/Safe-File-Variables.html I feel strongly that something like that is a much better direction to go in than an approch that tries to narrowly classify only "dangerous" config. That sort of approach would basically do the reverse. We'd whitelist "safe" config (e.g. diff.orderFile or whatever), and ask the user if they're OK with using config that falls outside of the whitelisting. By classifying our own config (and we'd probably need more than just "safe" and "executes arbitrary code") the common case is that users shouldn't need to answer those questions, as we'd know that the config is safe. This would be implemented by having a config mechanism "mark" which area(s) of config are "safe". We'd only pay attention to such a config from sources that area already "safe". So, to begin with this addresses cases where e.g. a tool like git-annex will execute arbitrary executables based on git configuration, which any mechanism that marks only config git itself knows about won't be able to do (it uses its own config space). But it also extends this mechanism from being something *just* focused on narrowly addressing security to something generally useful. E.g. even if a repository on disk has "safe" config I might still want to say that I don't want to use its alias.* config or whatever. Whatever mechanism we end up with here, I think that now that the dust has settled on the CVE we'd do well to consider those sorts of problems. A core.editor pointing to "rm -rf /" is a security issue, but any such issues are just subsets of annoying third-party config doing something I didn't ask for. 1. https://lore.kernel.org/git/220412.86h76yglfe.gmgdl@xxxxxxxxxxxxxxxxxxx/ 2. I know I've mentioned this before, including at past dev meetings, but I think that people interested in our whole config / repo reading ACL and "post-ACL" story should really look at how Emacs solves this problem. Not just "yeah yeah, Ævar's blathering about Emacs again" :). It really is highly topical here, especially to the point that Johannes & Stolee are discussing about university setups. It's exactly the sort of setups it was aimed at. The problem space is essentially the same. In brief summary. Emacs's implementation language and its "config format" are one and the same thing. It loads it from ~, and it can load it from a directory you're looking at, or even the file, or over the network. For those familiar with "vim" it would be like its "expandtab" etc. "file variables" were really arbitrary C code that it would compile and dynamically load. Scary, I know. I think what it does to do that safely is basically the same sort of security model we should be aiming for in git when it comes to reading the config format. In a roundabout way it's doing the same thing: potentially executing arbitrary code. I wrote about this a while ago in the context of loading an in-repo (as in part of the repo content, a .gitconfig like we have .gitattributes) here: https://lore.kernel.org/git/87zi6eakkt.fsf@xxxxxxxxxxxxxxxxxxx/; I've mentioned it a few times on-list after, following that breadcrumb trail should lead to relevant discussions: https://lore.kernel.org/git/?q=%2F87zi6eakkt.fsf%40evledraar.gmail.com The tl;dr is that we'd move to a whitelisting-based approach where (for example, the details could be tweaked): 1. You could define in ~/.gitconfig (or other trusted "this is really the user's") what files/paths you trust. 2. Anything not on the list we'd either ignore or prompt you the first time about. 3. When you "trust" a "foreign" source we'd default to a narrow whitelist, i.e. saying "this config defines these 5 variables at these values". Is it OK to load those from this path? If the user says "yes" we'd write those answers to the ~/.gitconfig for future reference. I.e. these config values at this path are OK. 4. We'd provide a way to run git commands that normally read config in a way that completely ignores config, or that only trust "safe" config (e.g. user.{name,email}, but nothing/little else). This could be used for git-prompt.bash, git-completion.bash etc.) 5. We could (perhaps by default) prompt you for the first time you read config from a repo. Doing a "git clone" or "git init" would add a "this is OK", but otherwise asking would cover e.g. wget-ing a tarred-up git repo whose config is malicious. Moving that sort of model would definitely break things, but hopefully in a way that wouldn't be too disruptive. 3. https://lore.kernel.org/git/220413.86fsmgeq15.gmgdl@xxxxxxxxxxxxxxxxxxx/