On Mon, Jan 27, 2020 at 06:10:07PM -0500, Jeff King wrote:
> On Mon, Jan 27, 2020 at 02:49:14PM -0800, Emily Shaffer wrote:
>
> > > > - make sure that .gitmodules has enough information about which branch
> > > >   to use for each submodule
> > >
> > > Hum. I don't work with them day to day, but aren't we already in that
> > > state? Is that not what the 'branch' option for each submodule means?
> >
> > I've been corrected off-list that the 'branch' in .gitmodules is used
> > during 'git submodule update --remote', but not during 'git submodule
> > init' or 'git clone --recurse-submodules'. Then, for the problem in
> > discussion for this thread, it seems like a better choice is something
> > like 'git clone --recurse-submodules --use-gitmodules' or whatever we
> > want to call it - e.g., rather than fetching the branch where the server
> > knows HEAD, ask the .gitmodules to figure out which branch?
>
> Oof, I should have read this message before responding to the other one. ;)
>
> > It seems like that ought to live separately from --single-branch. In the
> > case where you very strictly only want to fetch one branch (not two
> > branches) I suppose you'd want something like 'git clone
> > --recurse-submodules --single-branch --branch=mysuperprojectbranch
> > --use-gitmodules' to make sure that only one branch per repo comes down.
> >
> > With n submodules of various naming schemas, provenance, etc., I don't
> > think there's a good case for recursing --branch one way or another; it
> > seems like filling out some config is the way to go.
>
> Yeah, I do still think that it makes sense for clone to pass along
> --single-branch, regardless, and then deal with branch selection problem
> separately on top.

Sure; I've got that ready to send shortly.
It seems to grab HEAD of the remote for each submodule, and then
checkout the specific commit ID the superproject wants - in my test
case, that commit ID was a direct ancestor of 'master', so the single
branch only got 'master'. I'm not sure how it would work with a commit
ID which doesn't exist in the single branch that was fetched; I'll write
a test and have a look.

>
> > I guess we could also teach it to take some input like
> > --submodule-branch-spec=foo.txt, and/or a multiply provided
> > --submodule-branch foo=foobranch --submodule-branch bar/baz=bazbranch.
> >
> > [foo.txt]
> > foo=foobranch
> > bar/baz=bazbranch
> >
> > With that approach, then someone gets a little more flexibility than
> > relying on what the .gitmodules has set up.
>
> Yeah, I agree that the most general form is being able to specify the
> mapping for each individually. At first I wondered why you'd ever _not_
> want to just use the branches specified in .gitmodules. But I suppose
> that gets embedded in the superproject history, which gets awkward as
> those commits move between branches. E.g., for an android-like project,
> you don't want to make a commit that says "use branch devel for all
> submodules" on the devel branch of your superproject. Eventually that
> will get merged to master, and you'd have to remember to switch it back
> to "master".

Yeah, or I suppose I might be doing something weird, like wanting to run
integration tests for the whole project on changes in just one
submodule, or something.

> So for the simple case, you probably do want to be able to say "use this
> branch for cloning all submodules".

I think it still makes sense to call this out explicitly, yes? Or do you
think that should just be the default?

>
> For the complex cases, you'd need that full mapping. But I think it may
> be worth it to punt on that for now.
> Even if we eventually added such a
> feature, I think we'd still want the simpler version anyway (because
> when you _can_ use it, it's going to be much easier). So there's nothing
> lost by doing the minimal thing now and waiting to see if more complex
> use cases even show up.
>
> Another thing occurs to me, though: should the binding of this submodule
> default branch be written to disk (e.g., a config option)? I'm thinking
> specifically that if you do:
>
>   git clone --submodule-branch=devel -b devel superproject
>
> and then later, you "git fetch" and find that somebody has added a new
> submodule, you'd presumably want the devel branch of that, too?

This made me think - I wonder if it makes sense to take
--submodule-branch as a wildcarded spec instead. So in your case, I
could say,

  git clone --submodule-branch *=devel -b devel superproject

And then I don't need to do anything differently for 'git fetch' later.
This also opens the door for some repos getting special treatment:

  git clone --submodule-branch-file=foo.txt -b dev example

  foo.txt:
  curl=stable-1.2.3
  nlohmann=v2.28
  example-*=dev
  *=master

(specifying specific versions for some source dependencies, dev branches
for submodules which are associated with the 'example' superproject and
might be getting active development, and a wild guess for everything
else)

I think that also tends to match the glob-expansion configs we use for
other things.

One thing sticking out to me about the idea of providing
--submodule-branch is that you need to know what's in the repo before
you clone it the first time, which being able to use globbing like this
kind of helps with. But then, I suppose if you don't know what you're
looking for, you're not also looking for a very precise filter on your
clone ;)

 - Emily
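
To make the wildcard idea concrete, here is a rough sketch of how a
spec like the foo.txt above might be resolved to a branch per
submodule. None of this exists in git: the option names in this thread
are proposals, the helper names (parse_branch_spec, resolve_branch) are
made up for illustration, and the submodule names "example-ui" and
"zlib" are hypothetical. It assumes first-match-wins ordering, and it
uses Python's fnmatch rather than git's own wildmatch, so edge cases
(notably how '*' treats '/') may differ.

```python
from fnmatch import fnmatchcase


def parse_branch_spec(text):
    """Parse 'pattern=branch' lines into an ordered list of (pattern, branch)."""
    rules = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, sep, branch = line.partition("=")
        if not sep or not pattern.strip() or not branch.strip():
            raise ValueError(f"malformed spec line: {raw!r}")
        rules.append((pattern.strip(), branch.strip()))
    return rules


def resolve_branch(name, rules, default=None):
    """Return the branch of the first rule whose glob matches the submodule name."""
    for pattern, branch in rules:
        if fnmatchcase(name, pattern):
            return branch
    return default  # assumption: no match means "fall back to current behavior"


# Same contents as the foo.txt example in the message above.
SPEC = """\
curl=stable-1.2.3
nlohmann=v2.28
example-*=dev
*=master
"""

if __name__ == "__main__":
    rules = parse_branch_spec(SPEC)
    for name in ("curl", "nlohmann", "example-ui", "zlib"):
        print(name, "->", resolve_branch(name, rules))
```

With first-match-wins, the catch-all '*=master' has to come last, which
is how the example file is already ordered. A submodule added later
would resolve against the same rules, which is the 'git fetch' case
above, provided the spec is stored somewhere after the clone.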