On Thu, Aug 01, 2019 at 12:13:19AM +0200, Johannes Schindelin wrote: > > This gets tricky, because some commands are intentionally avoiding the > > normal lookup procedure (e.g., clone or init, and probably things like > > upload-pack that want to enter another repo). So I think it is OK as > > long as the early-config code is explicitly saying "and please look at > > the refs in this specific direectory now", and it doesn't affect other > > possible code paths that might look at refs. I _think_ that's what > > you're suggesting above, but I just want to make sure (not that it > > matters either way for this patch). > > I think we already have the `git clone` problem with > `includeif.gitdir:`. AFAICT we _will_ discover a Git directory when > cloning inside an existing Git worktree. Yeah, I could well believe that. I think it's hard for the config code to say what's the right think to do here. If I'm running "git clone" from inside another repository, should I respect, say, an alias defined in that repository's config? Probably. But should I find that alias behind "includeif.gitdir"? I dunno. Maybe? So I'm not 100% sure the current behavior is buggy. And mostly I'd be happy to ignore it until somebody comes up with a compelling (real-world) example either way. > And as you say, there was no use case, and I would even contend that > there still is no use case. In the cover letter, I tried to concoct > something (using a branch-dependent pager) that sounds _really_ > far-fetched to even me. Yeah. I'd be totally fine if we left it with your fix here and nobody ever found time to work on this. :) > > > - const char *refname = resolve_ref_unsafe("HEAD", 0, NULL, &flags); > > > + const char *refname = !the_repository || !the_repository->gitdir ? > > > + NULL : resolve_ref_unsafe("HEAD", 0, NULL, &flags); > > > > I think the_repository is always non-NULL. > > No, it totally can be `NULL`. I know because my first version of the > patch did not have that extra check, and `git help -a` would segfault > outside a Git worktree when I had an `includeif.onbranch:` in my > `~/.gitconfig`. Hrm. But common-main calls initialize_the_repository(), which points it at &the_repo. And I can't find any other assignments. So how does it become NULL? And is every caller of have_git_dir() at risk of segfaulting? Ah, I see. I think it is that trace2 reads the configuration very early. I think we ought to do this: diff --git a/common-main.c b/common-main.c index 582a7b1886..89fd415e55 100644 --- a/common-main.c +++ b/common-main.c @@ -39,14 +39,14 @@ int main(int argc, const char **argv) git_resolve_executable_dir(argv[0]); + initialize_the_repository(); + trace2_initialize(); trace2_cmd_start(argv); trace2_collect_process_info(TRACE2_PROCESS_INFO_STARTUP); git_setup_gettext(); - initialize_the_repository(); - attr_start(); result = cmd_main(argc, argv); or possibly even move the trace2 bits to the very end of that function. The point of common-main is to do very basic setup. Doing tentative repo discovery and config reading there at all is surprising to me, to say the least. But I think we can at least make sure the library code is initialized first. > > The way similar sites check this is withV > > "!startup_info->have_repository" or have_git_dir(). The early-config > > code uses the latter, so we should probably match it here. > > Quite frankly, I'd rather not. At this point, it is not important > whether or not we discovered a Git directory, but whether or not we have > populated a dereference'able `the_repository`. Those are two different > things. What I'm concerned about it is whether there are cases where the_repository->gitdir is NULL, but we _could_ still look up refs. I.e., why is the rest of the config code using have_git_dir(), and why is this code path special? Again, I _think_ we might be able to get rid of have_git_dir() now. Back when it was introduced get_git_dir() did lazy setup, and these days it looks like it's just peeking at the_repository->gitdir. But it makes sense to me for this fix to be consistent with the surrounding code, and then to investigate have_git_dir() separately. > > Side note: I suspect there are some cleanup opportunities. IIRC, I had > > to add have_git_dir() to cover some cases where $GIT_DIR was set but > > we hadn't explicitly done a setup step, but there's been a lot of > > refactoring and cleanup in the initialization code since then. I'm not > > sure if it's still necessary. > > Yeah, well, I am not necessarily certain that we always ask the right > questions, such as asking whether we found a startup repository when we > need, in fact, to know whether `the_repository->refs` would cause a > segmentation fault because we would dereference a `NULL` pointer ;-) If there are cases where startup_info->have_repository is non-zero but we'd segfault, then I think that's a bug that is going to affect more spots than this, and we need to investigate and fix. But I don't think that is the case. We should only be setting it after calling set_git_dir(), and poking at the current sites which set that leads me to believe this is true. -Peff