On Sat, Feb 5, 2022 at 3:42 AM Eric Sunshine <sunshine@xxxxxxxxxxxxxx> wrote: > > On Sat, Feb 5, 2022 at 6:23 AM Eric Sunshine <sunshine@xxxxxxxxxxxxxx> wrote: > > On Thu, Jan 27, 2022 at 3:03 PM Elijah Newren <newren@xxxxxxxxx> wrote: > > > Some comments on the various code changes: > > > * clone/push/fetch related: > > > * there are *many* subprocesses involved in fetch/push and friends, > > > and they typically need to know the GIT_DIR they are operating on > > > * this involves: fetch-patch.c, connected.c, bundle.c, clone.c, > > > transport-helper.c, receive-pack.c, upload-pack.c > > > * this accounts for the majority of this patch > > > > It does feel a bit like a bandaid to insert new code at these > > locations to set GIT_DIR manually. It's not clear to readers why > > GIT_DIR is needed for these specific cases, nor what the impact is > > when it is not set. Thus, one wonders if such a blanket approach is > > indeed required or if a more narrow and directed fix can be applied, > > such as calling subprograms with an explicit --git-dir= rather than > > setting GIT_DIR with its potentially more broad and > > difficult-to-reason-about impact. > > I meant to ask here what was the nature of the various failures you > were seeing without GIT_DIR being set, and whether you had considered > tackling those failures with --git-dir= instead of GIT_DIR. If so, > were the problems too difficult to overcome by --git-dir= alone? > Fleshing out the commit message with such information might be > worthwhile. clone, receive-pack, etc. will often spawn subprocesses such as repack, or index-pack. Those kinds of commands will not have been invoked from within a particular git directory, so discovery in the subprocess either doesn't find the git directory or finds the wrong one (e.g. testcases where you're in a repo and do `git clone . my-clone` would cause subprocesses to use discovery and find the outer git repo rather than the 'my-clone' they are supposed to be working on). I didn't look at just --git-dir=. I probably could have, but since I was just trying to get a feel for how big of a change it was, and there's the possibility of nested subprocesses (clone forking some http-fetch thingy, which forks an index-pack subprocess, etc.), so I just did the environment variable to save the work from diving down the hierarchy. I agree that if we want to make this kind of change, a better commit message would probably be in order, and perhaps trying out --git-dir= instead of setting the environment variable. However, after seeing the patch myself and thinking about it for another week, this just doesn't seem like a change that's worth it to me. If someone else wants to pick up the patch and run with it, they should feel free, but I'm not motivated enough to do so. > By the way, I also didn't mean to imply that the > "difficult-to-reason-about GIT_DIR interaction in relation to > subprograms" problem is new to this patch; it isn't. That problem has > been around for ages (at least since 2007-08-01) but it wasn't as > obvious since the setting of GIT_DIR was so far removed from other > code which runs subprograms, thus readers likely wouldn't be thinking > about GIT_DIR when reading the code which runs subprograms. This patch > only makes the problem more obvious since readers now see the > sequence: > > (1) set GIT_DIR > (2) launch subprogram > > So, a reader is more likely to wonder why GIT_DIR is needed for the > subprogram and what possible wider side-effects it might have. Yes, well put.