Re: [PATCH] clone: teach --single-branch and --branch during --recurse

On Mon, Jan 27, 2020 at 08:31:39PM -0500, Jeff King wrote:
> On Mon, Jan 27, 2020 at 05:08:41PM -0800, Emily Shaffer wrote:
> 
> > > Yeah, I do still think that it makes sense for clone to pass along
> > > --single-branch, regardless, and then deal with branch selection problem
> > > separately on top.
> > 
> > Sure; I've got that ready to send shortly. It seems to grab HEAD of the
> > remote for each submodule, and then checkout the specific commit ID the
> > superproject wants - in my test case, that commit ID was a direct
> > ancestor of 'master', so the single branch only got 'master'. I'm not
> > sure how it would work with a commit ID which doesn't exist in the
> > single branch that was fetched; I'll write a test and have a look.
> 
> Yeah, it's definitely worth exploring how that works. I thought we had
> some kind of fallback for when we didn't manage to fetch the object. But
> maybe I am confusing it with the fallback for "we tried to fetch this
> specific object, but the other side doesn't allow that, so we grabbed a
> branch instead".

Ok, so I gave it a try. Some well-trimmed trace output:

1) git clone --recurse-submodules --single-branch <url> (the branch in
question is remote's HEAD)
 - Normal clone of superproject
 - git submodule--helper update-clone --progress --require-init
   --single-branch --
 - ultimately...
 - git clone --no-checkout --progress --separate-git-dir
   '/.../super_clone/.git/modules/sub'
   --single-branch --
   '/path/to/submodule/source'
   '/path/to/submodule/destination'
 - git checkout -f -q <ID of submodule's HEAD>

2) git clone --recurse-submodules --single-branch --branch other <url>
  'other' points to a commit of 'sub' which is not an ancestor of 'sub''s
  current HEAD.
 - Normal clone of superproject identical to 1)
 - git submodule--helper update-clone --progress --require-init
   --single-branch --
 - ultimately...
 - git clone --no-checkout --progress --separate-git-dir
   '/.../super_clone/.git/modules/sub'
   --single-branch --
   '/path/to/submodule/source'
   '/path/to/submodule/destination'
 - git fetch origin <ID of submodule's other branch>
 - git checkout -f -q <ID of submodule's other branch>

So, somewhere in the submodule machinery, it looks like we check whether
we have the commit in question, and if not, we do another fetch. So in
this case, we reach out to the server twice per submodule.
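
Roughly, the behavior in trace 2) looks like the sketch below. This is
only an approximation of what the trace shows, not the actual
submodule--helper code; the function name ensure_commit is made up for
illustration.

```shell
#!/bin/sh
# Sketch only: mimics the "check, then fetch by ID" step seen in the trace.
# ensure_commit <repo> <commit-id> is a hypothetical helper, not a git command.
ensure_commit () {
    repo=$1 commit=$2
    # cat-file -e exits non-zero when the object is not in the repository
    if git -C "$repo" cat-file -e "$commit^{commit}" 2>/dev/null
    then
        echo "present"
    else
        # Second round trip to the server: ask for the commit by its ID
        git -C "$repo" fetch -q origin "$commit" && echo "fetched"
    fi
}
```

Calling it twice for the same commit should fetch the first time and
find the object locally the second time.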

On the bright side, it doesn't fall over; on the dim side, I'd think
we could ask for this ref up front along with whatever branch HEAD is.
I thought there was a way we could tell the server we want 'master' as
well as '58c34ed'?
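
Outside of clone, a plain 'git fetch' does accept a branch and a commit
ID in the same invocation, so something like the sketch below gets both
in one round trip. The helper name is made up, and this is not how clone
or the submodule code works today. A server may also refuse a commit ID
that is not a ref tip unless it is configured to allow that (for
example via uploadpack.allowReachableSHA1InWant).

```shell
#!/bin/sh
# Sketch only: one fetch that asks for a branch and a specific commit.
# fetch_branch_and_commit <url> <dir> <branch> <commit-id> is hypothetical.
fetch_branch_and_commit () {
    url=$1 dir=$2 branch=$3 commit=$4
    git init -q "$dir" &&
    git -C "$dir" remote add origin "$url" &&
    # Two wants in a single fetch: the branch tip and the pinned commit
    git -C "$dir" fetch -q origin \
        "refs/heads/$branch:refs/remotes/origin/$branch" "$commit" &&
    # Detach HEAD at the commit the superproject asked for
    git -C "$dir" checkout -q -f "$commit"
}
```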

> 
> > > So for the simple case, you probably do want to be able to say "use this
> > > branch for cloning all submodules".
> > 
> > I think it still makes sense to call this out explicitly, yes? Or do you
> > think that should just be the default?
> 
> Yes, I think it should be a separate option from "--branch".
> 
> > This made me think - I wonder if it makes sense to take
> > --submodule-branch as a wildcarded spec instead. So in your case, I
> > could say,
> > 
> >   git clone --submodule-branch *=devel -b devel superproject
> > 
> > And then I don't need to do anything differently for 'git fetch' later.
> > This also opens the door for some repos getting special treatment:
> > 
> >   git clone --submodule-branch-file=foo.txt -b dev example
> > 
> >   foo.txt:
> >   curl=stable-1.2.3
> >   nlohmann=v2.28
> >   example-*=dev
> >   *=master
> 
> If we write it all as config, I think things may get simpler. IIRC,
> there is already submodule.*.foo config in .git/config (that can mirror
> and override what's in .gitmodules).

Hm. But at clone time, there is no .git/config yet, which is why I
proposed a file passed in at the command line. Although it does seem to
make sense to write down those preferences in the .git/config after.

I guess you could pass in configs at the command line, though, and then
you don't have to massage it to write your config after fetch.
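
A quick sketch of that: 'git clone -c <key>=<value>' writes the setting
into the new repository's .git/config as part of the clone. The key
used here, submodule.foo.clonehead, is only the name floated in the
quoted text that follows; git will store it, but nothing reads it
today. The helper name is also made up.

```shell
#!/bin/sh
# Sketch only: submodule.foo.clonehead is a proposed key, not an
# existing git setting. clone_with_clonehead is a hypothetical helper.
clone_with_clonehead () {
    url=$1 dir=$2 branch=$3
    git clone -q -c "submodule.foo.clonehead=$branch" "$url" "$dir" &&
    # The value now lives in the clone's own config file
    git -C "$dir" config --get submodule.foo.clonehead
}
```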

> So if we had some config option for "clone this branch for the submodule
> instead of HEAD", then that means you can do:
> 
>   git clone -c submodule.foo.clonehead=devel ...
> 
> and the result would be used by the submodule code, but also saved for
> future invocations. Likewise, if there's no "clonehead" config for a
> particular submodule, if we fall back to submodule.defaultclonehead,
> then you could do:
> 
>   git clone -c submodule.defaultclonehead=devel ...
> 
> and it would also be saved as the default for future submodules.  And
> all without having to invent a new submodule-branch-file format.
> 
> The name "clonehead" isn't great.

Au contraire - it might be my new go-to insult. ;)

> I'm not sure if this ought to be submodule.*.branch (since I don't
> quite know what that's used for). I think you'll have to explore that
> a bit.
> 
> > I think that also tends to match the glob-expansion configs we use for
> > other things. One thing sticking out to me about the idea of providing
> > --submodule-branch is that you need to know what's in the repo before
> > you clone it the first time, which being able to use globbing like this
> > kind of helps with. But then, I suppose if you don't know what you're
> > looking for, you're not also looking for a very precise filter on your
> > clone ;)
> 
> Yeah; the scheme I outlined above only allows specifying the value for
> one submodule, or the fallback default. It wouldn't allow arbitrary
> globbing. But I also suspect nobody wants that. If you know what the
> submodules are, then you can set up config for each. If you don't, then
> "everything" is the only glob that makes sense.

Yeah, I suspect you're right and this fancy globbing falls under YAGNI.
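
Without globbing, the lookup comes down to two config reads. A minimal
sketch, assuming the proposed (not yet existing) keys
submodule.<name>.clonehead and submodule.defaultclonehead, and assuming
we fall back to the remote's HEAD when neither is set; the function
name is made up:

```shell
#!/bin/sh
# Sketch only: both config keys are proposals from this thread.
# clone_head_for <repo> <submodule-name> is a hypothetical helper.
clone_head_for () {
    repo=$1 name=$2
    # Per-submodule setting wins, then the default, then plain HEAD
    git -C "$repo" config --get "submodule.$name.clonehead" ||
    git -C "$repo" config --get submodule.defaultclonehead ||
    echo HEAD
}
```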

 - Emily


