On 5/13/2022 7:37 PM, Glen Choo via GitGitGadget wrote: > From: Glen Choo <chooglen@xxxxxxxxxx> > > Add a config variable, `discovery.bare`, that tells Git whether or not > it should work with the bare repository it has discovered i.e. Git will > die() if it discovers a bare repository, but it is not allowed by > `discovery.bare`. This only affects repository discovery, thus it has no > effect if discovery was not done (e.g. `--git-dir` was passed). > This config is an enum of: > > - ["always"|(unset)]: always recognize bare repositories (like Git does > today) > - "never": never recognize bare repositories > > More values are expected to be added later, and the default is expected > to change (i.e. to something other than "always"). I think it is fine to include the "never" option for users to opt-in to this super-protected state, but I want to make it very clear that we should never move to it as a new default. This phrasing of 'something other than "always"' is key, but it might be good to point out that "never" is very unlikely to be that default. > WIP setup.c: make discovery.bare die on failure > > Signed-off-by: Glen Choo <chooglen@xxxxxxxxxx> Accidental concatenation of squashed commit? > diff --git a/Documentation/config/discovery.txt b/Documentation/config/discovery.txt > new file mode 100644 > index 00000000000..761cabe6e70 > --- /dev/null > +++ b/Documentation/config/discovery.txt > @@ -0,0 +1,24 @@ > +discovery.bare:: > + Specifies what kinds of directories Git can recognize as a bare > + repository when looking for the repository (aka repository > + discovery). This has no effect if repository discovery is not > + performed e.g. the path to the repository is set via `--git-dir` > + (see linkgit:git[1]). Avoid "e.g." here. This has no effect if the repository is specified directly via the --git-dir command-line option or the GIT_DIR environment variable. > +This config setting is only respected when specified in a system or global > +config, not when it is specified in a repository config or via the command > +line option `-c discovery.bare=<value>`. We are sprinkling config options that have these same restrictions throughout the config documentation. It might be time to define a term like "protected config" at the top of git-config.txt and then refer to that from these other locations. > +The currently supported values are `always` (Git always recognizes bare > +repositories) and `never` (Git never recognizes bare repositories). This sentence structure is likely to change in the future, and as it stands will become complicated. A bulleted list will have easier edits in the future. > +This defaults to `always`, but this default is likely to change. For now, I would say "but this default may change in the future." instead. > +If your workflow does not rely on bare repositories, it is recommended that > +you set this value to `never`. This makes repository discovery easier to > +reason about and prevents certain types of security and non-security > +problems, such as: > + (You might need a "+" here.) > +* `git clone`-ing a repository containing a malicious bare repository > + inside it. > +* Git recognizing a directory that isn't meant to be a bare repository, > + but happens to look like one. I think these last bits recommending the 'never' option are a bit distracting. It doesn't make repository discovery "easier to reason about" because we still discover the bare repo and die() instead of skipping it and looking higher for a non-bare repository in the parent directories. The case of an "accidentally-recognized bare repo" is so unlikely it is probably not worth mention in these docs. Instead, I think something like this might be better: If you do not use bare repositories in your workflow, then it may be beneficial to set `discovery.bare` to `never` in your global config. This will protect you from attacks that involve cloning a repository that contains a bare repository and running a Git command within that directory. > +static int check_bare_repo_allowed(void) > +{ > + if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN) { > + read_very_early_config(discovery_bare_cb, NULL); This will add the third place where we use read_very_early_config(), adding to the existing calls in tr2_sysenv_load() and ensure_valid_ownership(). If I understand it correctly, that means that every Git execution in a bare repository will now parse the system and global config three times. This doesn't count the check for uploadpack.packobjectshook in upload-pack.c that uses current_config_scope() to restrict its value to the system and global config. We are probably at the point where we need to instead create a configset that stores this "protected config" and allow us to lookup config keys directly from that configset instead of iterating through these config files repeatedly. > + /* We didn't find a value; use the default. */ > + if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN) > + discovery_bare_config = DISCOVERY_BARE_ALWAYS; This could also be done in advance of the config parsing by setting discovery_bare_config = DISCOVERY_BARE_ALWAYS before calling read_very_early_config(). Avoids an if and a comment here, which might be nice. > + } > + switch (discovery_bare_config) { > + case DISCOVERY_BARE_NEVER: > + return 0; > + case DISCOVERY_BARE_ALWAYS: > + return 1; > + default: > + BUG("invalid discovery_bare_config %d", discovery_bare_config); > + } You return -1 in discovery_bare_cb when the key matches, but the value is not understood. Should we check the return value of read_very_early_config(), too? > +static const char *discovery_bare_config_to_string(void) > +{ > + switch (discovery_bare_config) { > + case DISCOVERY_BARE_NEVER: > + return "never"; > + case DISCOVERY_BARE_ALWAYS: > + return "always"; > + default: > + BUG("invalid discovery_bare_config %d", discovery_bare_config); In general, I'm not sure these BUG() statements are helpful, but they aren't hurting anything. I wonder if it would be better to use DISCOVERY_BARE_UNKNOWN instead of default, because then the compiler should notice that the switch needs updating when a new enum mode is added. > @@ -1142,7 +1195,8 @@ enum discovery_result { > GIT_DIR_HIT_CEILING = -1, > GIT_DIR_HIT_MOUNT_POINT = -2, > GIT_DIR_INVALID_GITFILE = -3, > - GIT_DIR_INVALID_OWNERSHIP = -4 > + GIT_DIR_INVALID_OWNERSHIP = -4, > + GIT_DIR_DISALLOWED_BARE = -5 I think that you can add a comma at the end of this enum to avoid the changed line the next time the enum needs to be expanded. > }; > > /* > @@ -1239,6 +1293,8 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir, > } > > if (is_git_directory(dir->buf)) { > + if (!check_bare_repo_allowed()) > + return GIT_DIR_DISALLOWED_BARE; Won't this fail if someone runs a Git command inside of a .git/ directory for a non-bare repository? I just want to be sure that we hit this error instead: fatal: this operation must be run in a work tree I see that this error is tested in t0008-ignores.sh, but that's with the default "always" value. It would be good to explicitly check that this is the right error when using the "never" config. Thanks, -Stolee