On Tue, Oct 11, 2022 at 3:20 PM Glen Choo <chooglen@xxxxxxxxxx> wrote:
>
> Jacob Keller <jacob.keller@xxxxxxxxx> writes:
>
> > On Tue, Oct 4, 2022 at 11:12 AM Glen Choo <chooglen@xxxxxxxxxx> wrote:
> >>
> >> Hi Jacob! Thanks for the report!
> >>
> >
> > Thanks for responding!
>
> :)
>
> >> Or, if you could include a reproduction script, that would be really
> >> helpful :)
> >>
> >
> > I'm not sure how to do this, because it is only an intermittent
> > failure. I suspect it has to do with when the submodule actually needs
> > to update.
> >
> > Perhaps I can come up with something though. If I can, I'll send it as
> > a new test.
>
> That would be greatly appreciated, thanks!
>
> If you find code pointers useful,
>
> - builtin/submodule--helper.c:fetch_in_submodule() contains the logic
>   for fetching during "git submodule update"
>
> - submodule.c:fetch_submodules() contains the logic for fetching during
>   "git fetch --recurse-submodules" (which is invoked by "git pull
>   --recurse-submodules").
>

I was able to get a test highlighting the failure. It shows that a
single remote works, but adding another remote causes the fetch to fail
because it falls back to 'origin'.

>
> >> >
> >> > remote: Enumerating objects: 210, done.
> >> > remote: Counting objects: 100% (207/207), done.
> >> > remote: Compressing objects: 100% (54/54), done.
> >> > remote: Total 210 (delta 123), reused 197 (delta 119), pack-reused 3
> >> > Receiving objects: 100% (210/210), 107.20 KiB | 4.29 MiB/s, done.
> >> > Resolving deltas: 100% (123/123), completed with 48 local objects.
> >> > From <redacted>
> >> > ...
> >> > Fetching submodule submodule
> >> > From <redacted>
> >> >    85e0da7533d9..80cc886f1187  <redacted>
> >> > Fetching submodule submodule2
> >> > fatal: 'origin' does not appear to be a git repository
> >> > fatal: Could not read from remote repository.
> >> >
> >> > Please make sure you have the correct access rights
> >> > and the repository exists.
> >> > Errors during submodule fetch:
> >> >         submodule2
> >>
> >> I assume this is `git fetch` running in the superproject?
> >>
> >
> > Its git pull --rebase, but I suppose as part of this it will run
> > something equivalent to git fetch?
>
> Unfortunately, this doesn't narrow it down much because "git pull
> --recurse-submodules" runs _both_ "git fetch --recurse-submodules" _and_
> "git submodule update [--rebase]" ;) Without more context, it's not
> clear which of those is failing.
>

It's definitely "git fetch --recurse-submodules"; the new test should
show this.

> >> When fetching with `git fetch`, submodules are fetched without
> >> specifying the remote name, which means Git guesses which remote you
> >> want to fetch from, which is documented at
> >> https://git-scm.com/docs/git-fetch. I believe (I haven't reread this
> >> very closely) this is, in order:
> >>
> >> - The remote of your branch, i.e. the value of the config value
> >>   `branch.<name>.remote`
> >
> > So basically if its checked out to a branch it will fetch from the
> > remote of that branch, but...
> >
> >> - origin
> >>
> >
> > It defaults to origin, so if you have the usual "checked out as a
> > detached head" style of submodule, it can't find the remote branch.
>
> Yes, this sounds about right. I was quite certain that we only default
> to "origin", but I observe that "git fetch" doesn't fail if there is
> only one remote and it is not named "origin". Perhaps I'm mistaken, or I
> simply couldn't track down that logic.
>

We definitely default to the single/lone remote. I have two tests: one
that shows the single remote working and another that shows the
additional remote causing the failure.

> >> But... I'll mention another wrinkle for completeness' sake (though I
> >> don't think it applies to you).
> >> If you fetch using `git submodule update`, the submodule is fetched
> >> using a _named_ remote, specifically:
> >>
> >> - If the superproject has a branch checked out, it uses the name of the
> >>   superproject branch's remote.
> >
> > Right, so that explains why I can re-run git submodule update after a
> > git pull --rebase and it works.
> >
> > In theory wouldn't it make more sense to use the remote based on the
> > URL of the .gitmodules file?
>
> Ah, yes that's one possibility we (the folks working on an improved
> Submodules UX) have considered. Another would be to teach submodules to
> actually use branches correctly and to use the remotes of the branches.
>

Yes, if we can have it checked out on a branch and just rewind that
branch to match the expected commit instead of having it in a detached
state, things would be much easier. I recall work being done on this
years ago, but it is quite a thorny problem.

> In general, the project tries not to respect config coming directly from
> .gitmodules (c.f. [1]), but I agree that there's a lot of room for
> improvement.
>

Right. I think I'd rather go with a config option inside the
.git/config [submodule] section. I don't think .gitmodules itself needs
to know this, just that the parent project could be informed of what
remote to default to when fetching inside the submodule. That, or
somehow unify the git submodule update code with the recursive
fetching?

> [1] https://lore.kernel.org/git/xmqq35bze3rr.fsf@gitster.g
>
> >> - If the superproject does not have a branch checked out, it uses
> >>   "origin".
> >>
> >
> > I suppose one option would be to make this configurable. I started
> > using "upstream" as the default remote name for most of my
> > repositories when I began working with forks a lot more.
>
> My hope is that the work I mentioned earlier makes this code obsolete
> and nobody ever has to configure this ;)
>

Yeah. I definitely like the idea of using branches instead of a
detached head state.
I think for now I can avoid this by just disabling recursive fetch in
my config, which at least gets around the problem well enough. Another
alternative I thought of was maybe "try to fetch every remote" instead
of trying to fetch only a single remote?

> >
> >> >
> >> > Thanks,
> >> > Jake
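
For reference, here is a rough Python sketch of the fallback order as
described in this thread (the branch's remote, then a lone remote, then
"origin"). The function name and its arguments are made up for
illustration; this is not Git's actual code, just the behavior the two
tests seem to show.

```python
from typing import Optional, Sequence


def default_fetch_remote(branch_remote: Optional[str],
                         remotes: Sequence[str]) -> str:
    """Hypothetical helper: pick the remote a bare `git fetch` would use.

    branch_remote: value of branch.<name>.remote for the checked-out
                   branch, or None (e.g. a detached HEAD submodule).
    remotes:       names of all remotes configured in the repository.
    """
    # 1. A checked-out branch with a configured remote wins.
    if branch_remote:
        return branch_remote
    # 2. With exactly one remote configured, use it whatever its name.
    if len(remotes) == 1:
        return remotes[0]
    # 3. Otherwise fall back to "origin", even if no such remote exists.
    return "origin"


# Detached HEAD, single remote named "upstream": the fetch works.
print(default_fetch_remote(None, ["upstream"]))
# Detached HEAD, second remote added: falls back to "origin" and fails
# if no remote has that name.
print(default_fetch_remote(None, ["upstream", "fork"]))
```

The second call is the case from my report: nothing in the list is
named "origin", so the fetch inside the submodule has nowhere to go.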