Re: [PATCH] clone: teach --single-branch and --branch during --recurse

On Mon, Jan 27, 2020 at 06:10:07PM -0500, Jeff King wrote:
> On Mon, Jan 27, 2020 at 02:49:14PM -0800, Emily Shaffer wrote:
> 
> > > >   - make sure that .gitmodules has enough information about which branch
> > > >     to use for each submodule
> > > 
> > > Hum. I don't work with them day to day, but aren't we already in that
> > > state? Is that not what the 'branch' option for each submodule means?
> > 
> > I've been corrected off-list that the 'branch' in .gitmodules is used
> > during 'git submodule update --remote', but not during 'git submodule
> > init' or 'git clone --recurse-submodules'. Then, for the problem in
> > discussion for this thread, it seems like a better choice is something
> > like 'git clone --recurse-submodules --use-gitmodules' or whatever we
> > want to call it - e.g., rather than fetching the branch where the server
> > knows HEAD, ask the .gitmodules to figure out which branch?
> 
> Oof, I should have read this message before responding to the other one. ;)
> 
> > It seems like that ought to live separately from --single-branch. In the
> > case where you very strictly only want to fetch one branch (not two
> > branches) I suppose you'd want something like 'git clone
> > --recurse-submodules --single-branch --branch=mysuperprojectbranch
> > --use-gitmodules' to make sure that only one branch per repo comes down.
> > 
> > With n submodules of various naming schemas, provenance, etc., I don't
> > think there's a good case for recursing --branch one way or another; it
> > seems like filling out some config is the way to go.
> 
> Yeah, I do still think that it makes sense for clone to pass along
> --single-branch, regardless, and then deal with the branch selection problem
> separately on top.

Sure; I've got that ready to send shortly. It seems to grab HEAD of the
remote for each submodule, and then check out the specific commit ID the
superproject wants - in my test case, that commit ID was a direct
ancestor of 'master', so the single branch only got 'master'. I'm not
sure how it would work with a commit ID which doesn't exist in the
single branch that was fetched; I'll write a test and have a look.
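
The ancestry question above can be modeled without touching git at all.
The sketch below is only a toy: the commit names, the PARENTS graph and
the pinned_commit_available() helper are all hypothetical, and it says
nothing about what git actually does when the pinned commit is missing
(that is what the planned test would show). It only illustrates why a
pin that is an ancestor of 'master' is covered by a single-branch fetch
of 'master', while a pin on another branch is not.

```python
from collections import deque


def reachable_from(tip, parents):
    """Return every commit reachable from `tip` by following parent links."""
    seen = set()
    queue = deque([tip])
    while queue:
        commit = queue.popleft()
        if commit in seen:
            continue
        seen.add(commit)
        queue.extend(parents.get(commit, ()))
    return seen


# Hypothetical toy history: A <- B <- C is 'master'; X forks off B on 'devel'.
PARENTS = {"A": [], "B": ["A"], "C": ["B"], "X": ["B"]}
BRANCH_TIPS = {"master": "C", "devel": "X"}


def pinned_commit_available(pinned, branch):
    """True if fetching only `branch` would also bring down `pinned`."""
    return pinned in reachable_from(BRANCH_TIPS[branch], PARENTS)
```

In a real repository the same check is available as
'git merge-base --is-ancestor <commit> <branch>', which exits 0 when the
commit is an ancestor of the branch tip.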

> 
> > I guess we could also teach it to take some input like
> > --submodule-branch-spec=foo.txt, and/or a multiply provided
> > --submodule-branch foo=foobranch --submodule-branch bar/baz=bazbranch.
> > 
> >   [foo.txt]
> >   foo=foobranch
> >   bar/baz=bazbranch
> > 
> > With that approach, then someone gets a little more flexibility than
> > relying on what the .gitmodules has set up.
> 
> Yeah, I agree that the most general form is being able to specify the
> mapping for each individually. At first I wondered why you'd ever _not_
> want to just use the branches specified in .gitmodules. But I suppose
> that gets embedded in the superproject history, which gets awkward as
> those commits move between branches. E.g., for an android-like project,
> you don't want to make a commit that says "use branch devel for all
> submodules" on the devel branch of your superproject. Eventually that
> will get merged to master, and you'd have to remember to switch it back
> to "master".

Yeah, or I suppose I might be doing something weird, like wanting to run
integration tests for the whole project on changes in just one
submodule, or something.

> So for the simple case, you probably do want to be able to say "use this
> branch for cloning all submodules".

I think it still makes sense to call this out explicitly, yes? Or do you
think that should just be the default?

> 
> For the complex cases, you'd need that full mapping. But I think it may
> be worth it to punt on that for now. Even if we eventually added such a
> feature, I think we'd still want the simpler version anyway (because
> when you _can_ use it, it's going to be much easier). So there's nothing
> lost by doing the minimal thing now and waiting to see if more complex
> use cases even show up.
> 
> Another thing occurs to me, though: should the binding of this submodule
> default branch be written to disk (e.g., a config option)? I'm thinking
> specifically that if you do:
> 
>   git clone --submodule-branch=devel -b devel superproject
> 
> and then later, you "git fetch" and find that somebody has added a new
> submodule, you'd presumably want the devel branch of that, too?

This made me think - I wonder if it makes sense to take
--submodule-branch as a wildcarded spec instead. So in your case, I
could say,

  git clone --submodule-branch *=devel -b devel superproject

And then I don't need to do anything differently for 'git fetch' later.
This also opens the door for some repos getting special treatment:

  git clone --submodule-branch-file=foo.txt -b dev example

  foo.txt:
  curl=stable-1.2.3
  nlohmann=v2.28
  example-*=dev
  *=master

(specifying specific versions for some source dependencies, dev branches
for submodules which are associated with the 'example' superproject and
might be getting active development, and a wild guess for everything
else)

I think that also tends to match the glob-expansion configs we use for
other things. One thing sticking out to me about the idea of providing
--submodule-branch is that you need to know what's in the repo before
you clone it the first time, which being able to use globbing like this
kind of helps with. But then, I suppose if you don't know what you're
looking for, you're not also looking for a very precise filter on your
clone ;)
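
To make the wildcard idea concrete, here is a minimal sketch of how such
a spec might be resolved. None of this exists in git: the file format,
the first-match-wins ordering, and the helper names parse_branch_spec()
and branch_for() are assumptions made purely for illustration, and
matching is done against the submodule name with Python's fnmatch rather
than git's own wildmatch code.

```python
from fnmatch import fnmatchcase


def parse_branch_spec(text):
    """Parse 'pattern=branch' lines into an ordered list of (pattern, branch)."""
    rules = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        pattern, sep, branch = line.partition("=")
        pattern, branch = pattern.strip(), branch.strip()
        if not sep or not pattern or not branch:
            raise ValueError(f"malformed spec line: {raw!r}")
        rules.append((pattern, branch))
    return rules


def branch_for(submodule, rules):
    """Return the branch of the first matching rule, or None if none match."""
    for pattern, branch in rules:
        if fnmatchcase(submodule, pattern):
            return branch
    return None


# The foo.txt contents from the example above.
SPEC = """\
curl=stable-1.2.3
nlohmann=v2.28
example-*=dev
*=master
"""
RULES = parse_branch_spec(SPEC)
```

With first-match-wins, the catch-all '*=master' has to come last, and a
submodule added later (the 'git fetch' case) is resolved by the same
rules with no extra bookkeeping. A bare '*=devel' spec behaves like the
single --submodule-branch=devel form.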

 - Emily


