Hi Glen, On Fri, May 06, 2022 at 06:30:10PM +0000, Glen Choo via GitGitGadget wrote: > From: Glen Choo <chooglen@xxxxxxxxxx> > > Add a config variable, `safe.barerepository`, that tells Git whether or > not to recognize bare repositories when it is trying to discover the > repository. This only affects repository discovery, thus it has no > effect if discovery was not done (e.g. `--git-dir` was passed). Thanks for working on this! I'm excited to see some patches here, though I'm not totally convinced of this direction. More below. To summarize, this proposal attempts to work around the problem of embedding bare repositories in non-bare checkouts by providing a way to opt-out of bare repository discovery (which is to only discover things that are listed in the safe.bareRepository configuration). I agree that this would prevent the problem you're trying to solve, but I have significant concerns that this patch is going too far (at the risk of future damage to unrelated workflows) in order to accomplish that goal. My concern is that if we ever flipped the default (i.e. that "safe.bareRepository" might someday be ""), that many legitimate cases of using bare repositories would be broken. I think there are many such legitimate use cases that _do_ rely on discovering bare repositories (i.e., Git invocations that do not have a `--git-dir` in their command-line). One such example would be forges, but I imagine that there are many other uses we don't even know about, and I would like to avoid breaking those if we ended up changing the default. If it's possible to pursue a more targeted fix that leaves non-embedded bare repositories alone, I'd like to try and focus these efforts on a more narrow fix that would address just the case of embedded bare repositories. I think that the direction I outlined in: https://lore.kernel.org/git/Ylobp7sntKeWTLDX@nand.local/ could be a good place to start (see the paragraph beginning with "Here's an alternative approach" and below for the details). One potential problem with that approach (that this patch doesn't suffer from) is that any discovery which finds a bare repository would have to continue up to the root of the volume in order to figure out whether or not that bare repository is embedded in another non-bare one. That is probably a non-starter due to performance, but I think you could easily work around with a top-level setting that controls whether or not you even _care_ about embedded bare repositories. For example, if I set safe.bareRepository='*' in my top-level /etc/gitconfig, then we can avoid having to continue discovery for bare repositories altogether because we know we'll allow it anyway. To pursue a change that targets just embedded bare repositories, I think you fundamentally have to do an exhaustive repository discovery in order to figure out whether the (bare) repository you're dealing with is embedded or not. So having an opt-out for users that either (a) don't care or (b) can't accept the performance degradation that Emily mentioned as a result of doing unbounded filesystem traversal would be sensible. Playing devil's advocate for a moment, though, even if we had something like the proposal I outlined, flipping the top-level default from '*' to some value that implies we stop working in embedded bare repositories will break existing workflows. But that breakage would just be limited to embedded bare repositories, and not non-embedded ones. So I think on balance that breakage would affect fewer real-world users, while still being just as easy to recover from. > safe.barerepository is presented to users as an allow-list of > directories that Git will recognize as a bare repository during the > repository discovery process (much like safe.directory), but this patch > only implements (and permits) boolean behavior (i.e. on, off and unset). > Hopefully, this gives us some room to discuss and experiment with > possible formats. > > Thanks to Taylor for suggesting the allow-list idea :) I did suggest an allow-list, but not this one ;-). > I think the core concept of letting users toggle bare repo discovery is > solid, but I'm sending this as RFC for the following reasons: > > * I don't love the name safe.barerepository, because it feels like Git > is saying that bare repos are unsafe and consequently, that bare repo > users are behaving unsafely. On the other hand, this is quite similar > to safe.directory in a few respects, so it might make sense for the > naming to reflect that. Yes, the concerns I outlined above are definitely echoing this sentiment. Another way to say it is that this feels like too big of a hammer (i.e., it is targeting _all_ bare repositories, not just embedded ones) for too small of a nail (embedded bare repositories). As you're probably sick of hearing me say by now, I would strongly prefer a more targeted solution (perhaps what I outlined, or perhaps something else, so long as it doesn't break non-embedded bare repositories if/ever we decided to change the default value of safe.bareRepository). > * The *-gcc CI jobs don't pass. I haven't discerned any kind of pattern > yet. Interesting. I wouldn't expect this to be the case (since the default is to allow everything right now). > * In the longer-term, we might identify a usable-enough default that we > can give opt-out protection that works for the vast majority of > users. Perhaps, and I think if this were the case then I would feel differently about this patch. But I don't want us to paint ourselves into a corner, either. It would be unfortunate to, say, find ourselves in a position where the only protection against some novel embedded bare repository attack is to change a default that would break many existing workflows for _non_-embedded bare repositories. > = Other questions/Concerns > > * Maybe it's more informative for the user if we die() (or warn()) when > we find a bare repo instead of silently ignoring it? We should definitely provide more feedback to the user. If I set `safe.bareRepository` to the empty string via a global config, and then execute a Git command in a non-embedded bare repository, I get: $ git.compile config --get --global --default='*' safe.bareRepository $ git.compile rev-parse --absolute-git-dir fatal: not a git repository (or any of the parent directories): .git whereas on the last release of Git, I get instead: $ git rev-parse --absolute-git-dir /home/ttaylorr/repo.git I'm still not convinced that just reading repository extensions while ignoring the rest of config and hooks is too confusing, so I'd be more in favor of something like: $ git.compile rev-parse --absolute-git-dir warning: ignoring repository config and hooks advice: to permit bare repository discovery (which advice: will read config and hooks), consider running: advice: advice: $ git config --global --add safe.bareRepository /home/ttaylorr/repo.git /home/ttaylorr/repo.git (though I still feel strongly that we should pursue a more targeted approach here). Thanks, Taylor