On Thu, Apr 14 2022, Glen Choo wrote:

> Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> writes:
>
>> On Thu, Apr 07 2022, Derrick Stolee wrote:
>>
>>> A more complete protection here would be:
>>>
>>>  1. Warn when finding a bare repo as a tree (this patch).
>>>
>>>  2. Suppress warnings on trusted repos, scoped to a specific set of known
>>>     trees _or_ based on some set of known commits (in case the known trees
>>>     are too large).
>>>
>>>  3. Prevent writing a bare repo to the worktree, unless the user provided
>>>     an opt-in to that behavior.
>>>
>>> Since your patch is moving in the right direction here, I don't think
>>> steps (2) and (3) are required to move forward with your patch. However,
>>> it is a good opportunity to discuss the full repercussions of this issue.
>>
>> Isn't a gentler solution here to:
>>
>>  1. In setup.c, we detect a repo
>>  2. Walk up a directory
>>  3. Do we find a repo?
>>  4. Does that repo "contain" the first one?
>>     If yes: die on setup
>>     If no: it's OK
>>
>> It also seems to me that there's pretty much perfect overlap between
>> this and the long-discussed topic of marking a submodule with config
>> v.s. detecting it on the fly.
>
> Your suggestion seems similar to:
>
>   == 3. Detect that we are in an embedded bare repo and ignore the embedded bare
>   repository in favor of the containing repo.
>
> which I also think is a simple, robust mitigation if we put aside the
> problem of walking up to the root in too many situations. I seem to
> recall that this problem has come up before in [1] (and possibly other
> topics? I wasn't really able to locate them through a cursory search..),
> so I assume that's what you're referring to by "long-discussed topic".

Yes, I mean the submodule.superprojectGitDir topic.

> (Forgive me if I'm asking you to repeat yourself yet another time) I
> seem to recall that we weren't able to reach consensus on whether it's
> okay for Git to opportunistically walk up the directory hierarchy during
> setup, especially since there are some situations where this is
> extremely expensive (VFS, network mount).

I'm not sure, but I think per the later
https://lore.kernel.org/git/220204.86pmo34d2m.gmgdl@xxxxxxxxxxxxxxxxxxx/
and
https://lore.kernel.org/git/220311.8635joj0lf.gmgdl@xxxxxxxxxxxxxxxxxxx/
that any optimization concerns were likely just "this is slow in
shellscript" and not at the FS level.

There were also passing references to some internal Google-specific
NFS-ish implementation that I know nothing about (but you might),
i.e. what I asked about in:
https://lore.kernel.org/git/220212.864k53yfws.gmgdl@xxxxxxxxxxxxxxxxxxx/

But given superprojectGitDir becoming a boolean instead of a path in
v9, I'm not sure/have no idea.

The only thing I'm sure of is that if past iterations of the series
were addressing such a problem as an optimization, that doesn't seem
to be a current goal.

As noted in those past exchanges I have tested this method on e.g. AIX,
whose FS is unbelievably slow, and I couldn't even tell the difference.

That's because if you look at the total FS syscalls, even for an
uninitialized repo just traversing .git, getting config etc. is going to
dwarf "walking up" in terms of number of calls.

Of course not all calls are going to be equal, and there's that
potential "I'm not NFS-y, but a parent is" case etc.

In any case, I think that even *if* we had such a case somewhere, this
plan would still make sense. Such users could simply set
GIT_CEILING_DIRECTORIES or something similar if they cared about the
performance.

But for everyone else we'd do the right thing, and not prematurely
optimize.

I.e. we actually *are* concerned not with "does it look like a bare
repo?"
but "is this thing that looks like a bare repo within our current actual
repo or not?".

> I actually like this option quite a lot, but I don't see how we could
> implement this without imposing a big penalty to all bare repo users -
> they'd either be forced to set GIT_DIR or GIT_CEILING_DIRECTORIES, or
> take a (potentially big) performance hit. Hopefully I'm just framing
> this too narrowly and you're approaching this differently.

As noted in the [1] you quoted (link below) I tried to quantify that
potential penalty, and it seems to be a complete non-issue.

Of course there may be other scenarios where it matters, but I haven't
seen any concrete data to support that.

Doesn't pretty much everyone who cares about the performance of bare
repos in any capacity do so because they're running a server that's
using git-upload-pack and the like? Those require you to specify the
exact .git directory you want.

I.e. wouldn't this *only* apply to those doing the equivalent of "git -C
some-dir" to "cd" to a bare repo?

> PS: As an aside, wouldn't this also break libgit2? We could make this
> opt-out behavior, though that requires us to read system config _before_
> discovering the gitdir (as I discussed in [2]).

No, it wouldn't? I don't use libgit2, but upthread there's concern that
banning things that look-like-a-repo from being tracked would break it.

Whereas I'm pointing out that we don't need to do that, we can just keep
searching upwards.

But yes, it would "break" anything that assumed you could cd to that
tracked-looks-like-or-is-a-gitdir and have e.g. "git config" pick up its
config instead of our "real repo" config, but that's exactly what we
want in this case, isn't it?

I'm just pointing out that we can do it on the fly in setup.c, instead
of forbidding such content from ever being tracked within the
repository, which we'd be doing because we know we're doing the wrong
thing in that setup.c codepath. Let's just fix that bit in setup.c
instead.
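
To make the "keep searching upwards" idea concrete, here is a rough
sketch in Python rather than C, purely for brevity. It is not what
setup.c does today, and it is not a proposed patch: the names
looks_like_gitdir() and discover() are made up for illustration, the
gitdir check is a crude stand-in for the real one, "contains" is
approximated by "is an ancestor directory with a .git entry" (it does
not check whether the path is actually tracked), and the ceilings
argument only approximates GIT_CEILING_DIRECTORIES.

```python
import os


def looks_like_gitdir(path):
    """Crude, hypothetical stand-in for git's gitdir check."""
    return (
        os.path.isfile(os.path.join(path, "HEAD"))
        and os.path.isdir(os.path.join(path, "objects"))
        and os.path.isdir(os.path.join(path, "refs"))
    )


def discover(start, ceilings=()):
    """Walk up from `start`; return (kind, path) for the repo to use.

    kind is "worktree" when some directory at or above `start` has a
    .git entry, "bare" when only a bare-looking directory was found,
    and None when nothing was found.
    """
    ceilings = {os.path.realpath(c) for c in ceilings}
    cur = os.path.realpath(start)
    bare_candidate = None
    while True:
        if os.path.exists(os.path.join(cur, ".git")):
            # A containing (or the current) worktree wins over any
            # bare-looking directory we passed on the way up.
            return ("worktree", cur)
        if bare_candidate is None and looks_like_gitdir(cur):
            # Remember it, but keep searching upwards.
            bare_candidate = cur
        parent = os.path.dirname(cur)
        # Stop at the filesystem root, or rather than step up into a
        # ceiling directory (assumed analogue of GIT_CEILING_DIRECTORIES).
        if parent == cur or parent in ceilings:
            break
        cur = parent
    if bare_candidate is not None:
        return ("bare", bare_candidate)
    return (None, None)
```

With this, running from inside a bare-looking directory that sits under
a checkout resolves to the containing checkout, while a standalone bare
repo still resolves to itself. Passing the checkout's top-level
directory as a ceiling stops the walk early, so the embedded directory
is used again; that is the trade-off someone opting out of the walk
would be accepting.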

> [1] https://lore.kernel.org/git/211109.86v912dtfw.gmgdl@xxxxxxxxxxxxxxxxxxx/
> [2] https://lore.kernel.org/git/kl6lv8vc90ts.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx