On Mon, May 16, 2022 at 03:07:35PM -0400, Derrick Stolee wrote: > On 5/13/2022 7:37 PM, Glen Choo via GitGitGadget wrote: > > Thanks all for the comments on v1, I've expanded this series somewhat to > > address them,... > > Please include a full cover letter with each version, so reviewers > can respond to the full series goals. > > Your series here intends to start protecting against malicious > embedded bare repositories by allowing users to opt-in to a more > protected state. [...] Thanks for the summary, which I think will be especially helpful to others looking at this series for the very first time. > The 'never' option is a good one for very security-conscious > users who really want to avoid problems. I don't anticipate that > users who know about this option and set it themselves are the > type that would fall for the social engineering required to > attack using this vector, but I can imagine an IT department > installing the value in system config across a fleet of machines. When I first read this, I disagreed, since presumably that same crowd has legitimate bare repositories that they want to continue being able to operate in without having to pass `--git-dir` or `$GIT_DIR` in. In fact... > I also think that this protection is too rigid: it restricts > _all_ bare repositories, not just embedded ones. There is no check > to see if the parent directory of the bare repository is inside a > non-bare repository. ...this resonates quite a bit more with me. "never" isn't a good option unless you aren't a user of bare repositories _and_ don't have any embedded bare repositories (either at all, or any ones that you trust). > This leads to what I think would be a valuable replacement for > the 'cwd' option: > > * 'no-embedded' allows non-embedded bare repositories. An > _embedded bare repository_ is a bare repository whose parent > directory is contained in the worktree of a non-bare Git > repository. When in this mode, embedded bare repositories are > not allowed unless the parent non-bare Git repository has a > 'safe.embedded' config value storing the path to the current > embedded bare repository. > > That was certainly difficult to write, but here it is as > pseudo-code to hopefully remove some doubt as to how this might > work: > > if repo is bare: > if value == "always": > return ALLOWED > if value == "never": > return FORBIDDEN; This is indeed very similar to a proposal I had made upthread (which you note lower down in this email). One thing that's nice is that we only have to traverse up to the parent repo when in the "no-embedded" mode. That may be slow (since it's unbounded all the way up to the filesystem root or a ceiling directory, whichever we encounter first), but I think it's unavoidable if you need to distinguish between embedded and non-embedded bare repositories. > path = get_parent_repo() > > if !path: > return ALLOWED > > if config_file_has_value("{path}/.git/config", "safe.embedded", repo): > return ALLOWED > > return FORBIDDEN > > With this kind of option, we can protect users from these > social engineering attacks while providing an opt-in protection > for scenarios where embedded bare repos are currently being used > (while also not breaking anyone using non-embedded bare repos). > > I think Taylor was mentioning something like this in his previous > replies, perhaps even to the previous thread on this topic. Yep, see: https://lore.kernel.org/git/Ylobp7sntKeWTLDX@nand.local/.a > This 'no-embedded' option is something that I could see as a > potential new default, after it has proven itself in a released > version of Git. I would be totally happy to see "no-embedded" become the default. It might be nice to issue a warning when the top-level config is unset, to give users a heads up about cases that may be broken, perhaps like: if repo is bare: switch (value) { case "always": return ALLOWED; case "never": return FORBIDDEN; case "no-embedded": # fallthrough case "": path = get_parent_repo() if !path return ALLOWED; if config_file_has_value("{path}/.git/config", "safe.embedded", repo) return ALLOWED; if value == "no-embedded": return FORBIDDEN; # otherwise, we're in an embedded bare repository with an unset # discovery.bare config. # # warn that this will break in the future... warning(_("%s is embedded within %s"), the_repository.path, path); advise(_("to allow discovery for this embedded repo, either run")); advise(_("")); advise(_(" $ git config --global discovery.bare always, or")); advise(_(" $ git -C '%s' config --local safe.embedded '%s'"), path, relpath(path, the_repository.path)); # ...but allow the invocation for now until the default is # changed. return ALLOWED; default: die(_("unrecognized value of discovery.bare: '%s'"), value); } ...where relpath is similar to Go's path/filepath.Rel function. With an appropriate deprecation period, I think we could even get away from the "continue executing, but don't read config+hooks", which in retrospect is more error-prone and difficult to reason about than I initially had given it credit for. Thanks, Taylor