Re: [PATCH 07/16] git-read-tree: take --submodules option

Hello,

I gave the problem some more thought. I follow up on my previous comment
below, but first, here is what I can see now:

So far the discussion has covered what should happen in fetch (+ checkout).
But I think the following are the interesting cases. Please read ALL before
responding, they are in somewhat random order:

 For the following, assume there is a repo of project super that has two
 branches, master and next. The next branch adds subproject sublib. In that
 state, I fetch refs/heads/*:refs/remotes/origin/* and check out master. So I
 have a repository that contains a revision with the submodule, but I have not
 checked it out yet.

 - Some time later, *without* fetching again, I simply
      git checkout --submodules -b feature1 remotes/origin/next
   Obviously it needs to give me the module.

 - Should that checkout work without network access?

 - Ok, now I start hacking on feature1 and find a bug in sublib, that I need
   to fix for it to work. Therefore I change something within sublib.

   However, a few days later I am asked to fix a bug in the stable release of
   super. Therefore I:
      git checkout master
   Now, where does sublib go? It contains precious data!

   For the worst case, assume that the master branch also has a directory
   sublib, so the submodule can't stay where it was as an unversioned
   directory.

 - The fix in master is done, back to our feature1, right?
      git checkout --submodules feature1
   Obviously re-fetching from upstream won't work. The head feature1 now
   refers to a commit that I made and that only exists locally.

 - Now the maintainer of super wants to test the feature1. However sublib
   upstream did not accept the bugfix yet (and is perhaps waiting for
   confirmation, that the fixed version really works well for super, so we
   have to test).

   Therefore I push feature1 to my public repo, set up a public repo with my
   fixes to sublib and configure my public super repo to know about it.

   The maintainer already has a repo of super including sublib submodule. But
   when he pulls from me, he does not have the repo with my fixes.

 - The maintainer reviewed my feature1 and now needs to work on feature2.
   That however requires a new upstream version of super. Therefore he needs
   to pull alternately from both upstream and my repo with super, depending
   on what he works on.

   For the most complex case, assume here that I add more fixes to sublib
   while the author of feature2 uses more and more bleeding edge stuff, so
   the maintainer really needs further changes in sublib from both repos.

 - Also git has to fail safe if I forget to push the sublib, so when the
   maintainer tries to pull super, the referred revision of sublib simply
   won't be found.

I am not sure how to handle these cases. But they are cases that can happen
in real life and should be handled somehow. Even if some of them just require
some manual configuration.

Here is one possible idea:

We could store the GIT_DIR of submodule within the GIT_DIR of the
superproject instead of the submodule directory itself. So instead of:
 /
 /.git
 /subdir
 /subdir/.git

There would be:
 /
 /.git
 /subdir
 /.git/submodules/submodule-name.git

This would require changes to the logic by which git finds GIT_DIR (which
would be a really deep change), but it would provide a place to store the
submodule data while the submodule is not checked out.
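
As a rough sketch of how a script could address a submodule repository under
this layout: the helper names submodule_git_dir and git_in_submodule are made
up for illustration, and it assumes a git that accepts the --git-dir and
--work-tree options. Nothing like this exists in git itself as far as this
mail is concerned.

```shell
# Hypothetical helper: path of a submodule's repository under the
# proposed layout, i.e. <superproject GIT_DIR>/submodules/<name>.git
submodule_git_dir() {
    super_git_dir=$1   # e.g. /path/to/super/.git
    name=$2            # submodule name, e.g. sublib
    printf '%s/submodules/%s.git\n' "$super_git_dir" "$name"
}

# Hypothetical helper: run a git command on the submodule even though
# there is no .git inside the submodule's working directory.
git_in_submodule() {
    super_git_dir=$1
    name=$2
    work_tree=$3       # e.g. /path/to/super/subdir
    shift 3
    git --git-dir="$(submodule_git_dir "$super_git_dir" "$name")" \
        --work-tree="$work_tree" "$@"
}
```

For example, "git_in_submodule .git sublib subdir status" would look at
subdir while keeping the history under .git/submodules/sublib.git, so the
history survives even when subdir is removed from the working tree.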

This does not address the last two cases above with multiple sources, each
containing some revisions. There I see two options:

 - The submodules are fetched during superproject fetch (based on them being
   configured, even if they are not checked out) and the URL might depend on
   url configured for superproject. That is:
      git fetch --submodules foobar
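   would do roughly:
      for sub in "$GIT_DIR"/submodules/*.git; do
         # separate variable, so the superproject's GIT_DIR is not clobbered
         git --git-dir="$sub" fetch foobar || git --git-dir="$sub" fetch
      done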
   So if you configured source of the same name for the subproject, it would
   be pulled, otherwise the default one would.

   Checkout would then be a local-only operation, because subprojects are
   up-to-date.

 - The superproject checkout would try fetching all sources of the
   subproject, until the requested revision is found.

   This could be extended to normal checkout doing it as well --
   "git checkout sha1" would try fetching all configured sources if the
   revision was not found.
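
   A minimal sketch of this second option, assuming the submodule repository
   can be addressed with --git-dir: the function name fetch_until_found is
   hypothetical, and the loop simply walks whatever remotes "git remote"
   lists for that repository. It also covers the fail-safe case above, by
   returning an error when no source has the revision.

```shell
# Hypothetical helper: fetch from each configured remote of a submodule
# repository until the wanted commit is available locally.
fetch_until_found() {
    sub_git_dir=$1   # repository of the submodule
    wanted=$2        # commit id recorded in the superproject

    # Already present? Then no network access is needed at all.
    if git --git-dir="$sub_git_dir" cat-file -e "$wanted^{commit}" 2>/dev/null; then
        return 0
    fi

    for remote in $(git --git-dir="$sub_git_dir" remote); do
        # A source that cannot be reached is skipped, not fatal.
        git --git-dir="$sub_git_dir" fetch "$remote" || continue
        if git --git-dir="$sub_git_dir" cat-file -e "$wanted^{commit}" 2>/dev/null; then
            return 0
        fi
    done

    # Fail safe: nothing is checked out if the revision is missing.
    echo "revision $wanted not found in any configured source" >&2
    return 1
}
```

   The checkout would call this before touching the working tree and abort
   if it returns non-zero.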

Perhaps we could actually do both. That is, "git fetch --submodules" to
also fetch all of "$GIT_DIR/submodules/*.git" and checkout to try fetching if
it can't find the desired revision.

On Sun, May 20, 2007 at 11:33:17 -0700, Junio C Hamano wrote:
> Jan Hudec <bulb@xxxxxx> writes:
> > IMHO it makes more sense to fetch during fetch of superproject:
> >
> >  - If you don't fetch the superproject, it won't start referring to
> >    unavailable commit of subproject. So should only need to fetch subproject
> >    after fetching superproject.
> 
> Eh, I was suggesting that the subproject fetch would come after
> checkout in "fetch and then checkout" sequence of the
> superproject, and if you are arguing against it, you should
> justify why it should not happen before checkout, as we both
> agree it should come after fetch of superproject.  Your argument
> is like saying you have to git-init before doing anything so
> you should fetch when you git-init.  That's not a justification.

It definitely has to come after fetch on superproject. My original thought
was, that it would be weird if it was part of the checkout itself, meaning
even checkout that does not follow a fetch. However I thought about it some
more and that might conflict with other requirements.

> >  - If you fetch from more than one location, you want to fetch subproject
> >    from location corresponding to where you fetch superproject from.
> 
> Not at all.  There is no reason to believe that the case that
> superproject and subproject come from related URLs is more
> common.  One of the reasons to do a separated project

I definitely don't think it's more common. But it's the harder case and it
might happen. Generally it will happen if some people work on both the
superproject and the subproject. Of course the argument is that then they
should not be separate projects, but maybe the teams just partly overlap.

An example of this situation is given above. IMHO it needs to be handled
somehow (probably git would have to check all potential sources for whether
they have the revision in question).

-- 
						 Jan 'Bulb' Hudec <bulb@xxxxxx>
