On Tue, 2016-03-01 at 03:35 -0500, Jeff King wrote: > On Mon, Feb 29, 2016 at 07:52:34PM -0500, David Turner wrote: > > > Usually, git calls some form of setup_git_directory at startup. > > But > > sometimes, it doesn't. Usually, that's OK because it's not really > > using the repository. But in some cases, it is using the repo. In > > those cases, either setup_git_directory_gently must be called, or > > the > > repository (e.g. the refs) must not be accessed. > > It's actually not just setup_git_directory(). We can also use > check_repository_format(), which is used by enter_repo() (and hence > by > things like upload-pack). I think the rule really ought to be: if we > didn't have check_repository_format_gently() tell us we have a valid > repo, we should not access any repo elements (refs, objects, etc). I'll change that commit message to say "check_repository_format_gently". > > diff --git a/builtin/grep.c b/builtin/grep. > [snip: this is a probably-good behavior change] Agreed. > My fix for this was to teach read_mailmap to avoid looking for > HEAD:.mailmap if we are not in a repository, but to continue with the > others (.mailmap in the cwd, and the mailmap.file config variable). > ... > But I do think your patch is a potential regression there, if anybody > does do that. Your version sounds better. But I don't see it in the patch set you sent earlier? > > diff --git a/git.c b/git.c > > index 6cc0c07..51e0508 100644 > > --- a/git.c > > +++ b/git.c > > @@ -376,7 +376,7 @@ static struct cmd_struct commands[] = { > > { "am", cmd_am, RUN_SETUP | NEED_WORK_TREE }, > > { "annotate", cmd_annotate, RUN_SETUP }, > > { "apply", cmd_apply, RUN_SETUP_GENTLY }, > > - { "archive", cmd_archive }, > > + { "archive", cmd_archive, RUN_SETUP_GENTLY }, > > { "bisect--helper", cmd_bisect__helper, RUN_SETUP }, > > { "blame", cmd_blame, RUN_SETUP }, > > { "branch", cmd_branch, RUN_SETUP }, > > I didn't have to touch this case in my experimenting. I wonder if > it's > because I resolved the "grep" case a little differently. > > I taught get_ref_cache() to only assert() that we have a repository > when > we are looking at the main ref-cache, not a submodule. In theory, we > can > look at a submodule from inside an outer non-repo (it's not really a > submodule then, but just a plain git dir). I don't think there's > anything in git right now that says you can't do so, though I think > your > refs-backend work does introduce that restriction (because it > actually > requires the submodules to use the same backend). > > So with that requirement, I think we do need to require a repo even > to > access submodule refs. Is that what triggered this change? No. What triggered this change was a test failure with your earlier patch on master -- none of my stuff at all. The failing command was: git archive --remote=. HEAD When writing my patch, I had assumed that the issue was the resolve_ref on the HEAD that's an argument -- but it's not. The actual traceback is: #0 die ( err=err@entry=0x57ddb0 "BUG: resolve_ref called without initializing repo") at usage.c:99 #1 0x00000000004f7ed9 in resolve_ref_1 (sb_refname=0x7c4a50 <sb_refname>, sb_contents=0x7fffffffcfc0, sb_path=0x7fffffffcfe0, flags=0x7fffffffdaaa, sha1=0x7fffffffd100 "\b\326\377\377\377\177", resolve_flags=5572384, refname=0x2 <error: Cannot access memory at address 0x2>) at refs/files-backend.c:1429 #2 resolve_ref_unsafe (refname=refname@entry=0x550b3b "HEAD", resolve_flags=resolve_flags@entry=0, sha1=sha1@entry=0x7fffffffd100 "\b\326\377\377\377\177", flags=flags@entry=0x7fffffffd0fc) at refs/files-backend.c:1600 #3 0x00000000004ffe69 in read_config () at remote.c:471 #4 0x0000000000500235 in read_config () at remote.c:705 #5 remote_get_1 (name=0x7fffffffdaaa ".", get_default=get_default@entry=0x4fe230 <remote_for_branch>) at remote.c:688 #6 0x00000000005004ca in remote_get (name=<optimized out>) at remote.c:713 #7 0x00000000004159d8 in run_remote_archiver (name_hint=0x0, exec=0x550720 "git-upload-archive", remote=<optimized out>, argv=0x7fffffffd608, argc=2) at builtin/archive.c:35 #8 cmd_archive (argc=2, argv=0x7fffffffd608, prefix=0x0) at builtin/archive.c:104 #9 0x0000000000406051 in run_builtin (argv=0x7fffffffd608, argc=3, p=0x7bd7a0 <commands+96>) at git.c:357 #10 handle_builtin (argc=3, argv=0x7fffffffd608) at git.c:540 #11 0x000000000040519a in main (argc=3, av=<optimized out>) at git.c:671 > I'd think you would need a matching line inside cmd_archive, too. It > should allow "--remote" without a repo, but generating a local > archive > does need one. And indeed, I see in write_archive() that we run > setup_git_repository ourselves, and die if we're not in a git repo. > So > I'm puzzled about which code path accesses the refs. I agree that --remote should work without a repo, It seems that we do n't test this and we probably should. I'm not sure what the right way to fix this is -- in read_config, we're about to access some stuff in a repo (config, HEAD). It's OK to skip that stuff if we're not in a repo, but we don't want to run setup_git_directory twice (that breaks some stuff), and some of the other callers have already called it. On top of your earlier repo_initialized patch, we could add the following to read_config: + if (!repo_initialized) { + int nongit = 0; + setup_git_directory_gently(&nongit); + if (nongit) + return; + } But that patch I think was not intended to be permanent. Still, it does seem odd that there's no straightforward way to know if the repo is initialized. Am I missing something? > > diff --git a/test-match-trees.c b/test-match-trees.c > But your patch here is certainly the right thing to do if we are > keeping it. Let's keep it for now; we could always remove it later. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html