Re: [RFC] Submodules in GIT

A few thoughts on this topic. Some of them repeat what
was said earlier.

1. Submodule (subproject) as commit-in-a-tree

Let's try to paint a little diagram (attribution missing):

belonging to:
/--------- supermodule -------\    /---- submodule -------\

commit -> tree +-> blob
  |            +-> tree -> ...
  |            +-----------------> commit -> tree -> ...
  v                                  |
commit -> tree +-> ...               v
  |            +-----------------> commit -> ...
  v                        /         |
commit -> tree +-> ...    /          |
  |            +---------/           v
  |                                commit -> ...
  v                                  |
commit -> tree +-> ...               v
               +-----------------> commit

Both have their independent history, but they are linked as some
submodule versions are part of the supermodule tree.
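
The diagram above can be read as a small data model. The following is
only a sketch in Python, not git code: the class names and the
submodule_commits() helper are made up for illustration, and the only
idea taken from the diagram is that a tree entry may point to a commit
of the submodule.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Union


@dataclass
class Blob:
    data: bytes


@dataclass
class Commit:
    tree: "Tree"
    parent: Optional["Commit"] = None


@dataclass
class Tree:
    # An entry is a blob, a subtree, or (the proposal) a commit that
    # belongs to the submodule's own, independent history.
    entries: Dict[str, Union[Blob, "Tree", Commit]] = field(default_factory=dict)


def submodule_commits(commit, path):
    """Walk supermodule history; collect the submodule commit linked at each step."""
    linked = []
    while commit is not None:
        entry = commit.tree.entries.get(path)
        if isinstance(entry, Commit):
            linked.append(entry)
        commit = commit.parent
    return linked


# Submodule history: s1 <- s2
s1 = Commit(Tree({"submfile": Blob(b"v1")}))
s2 = Commit(Tree({"submfile": Blob(b"v2")}), parent=s1)

# Supermodule history: m1 <- m2, each tree linking one submodule commit
m1 = Commit(Tree({"file": Blob(b"a"), "submodule": s1}))
m2 = Commit(Tree({"file": Blob(b"a"), "submodule": s2}), parent=m1)

history = submodule_commits(m2, "submodule")
```

Walking m2 back to m1 yields s2 and then s1: the two histories stay
independent, and the links live only in the supermodule trees.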


2. Working area for project with submodules

Submodule as separate repository model
supermodule
+ .git/  <------------------------.
  + HEAD                          |
  + index                         |
  + objects/                      |
  + objects/info/alternates ---.  |
+ subdir1/                     |  |
  + sub1file                   |  ^                   
+ submodule/                   |  |
  + .git                       v  |
    + HEAD                     |  |
    + index                    |  |
    + objects/  <--------------'  |
    +[objects/info/borrowers] ----'
  + subsubdir/
    + submfile
+ file

Embedded submodule model
supermodule
+ .git/
  + HEAD
  + index
  + objects/
  +[refs/submodules/submodule/HEAD]
  +[refs/submodules/submodule/index]
+ subdir1/
  + sub1file
+ submodule/
  + subsubdir/
    + submfile
+ file

The [fictional] borrowers file is there so that git-prune and friends
(also git-repack with the -d option) do not remove objects needed by the
supermodule (for example when submodule history gets rewritten). But you
can do without it, as long as you don't rewind or don't prune in
supermodule.

The problem with a submodule as a separate git repository is that if you
move the submodule (subproject) somewhere else in the repository (or just
rename it), you have to update the alternates file... and this happens not
only on the move itself, but also on checkout and reset. But that can be
managed by listing in alternates all possible places the submodule can end
up in. I don't know if it is truly a problem.
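
To make "all possible places" concrete, here is a rough sketch in Python
(not git code) that builds the alternates entries for a list of candidate
submodule locations. The alternates_lines() helper and the example paths
are made up for illustration. It assumes that relative entries in
objects/info/alternates are resolved against the object directory, here
the supermodule's .git/objects.

```python
import posixpath


def alternates_lines(submodule_paths):
    """Return alternates entries relative to the supermodule's .git/objects."""
    lines = []
    for path in submodule_paths:
        # Object database of a submodule checked out at `path`.
        target = posixpath.join(path, ".git", "objects")
        rel = posixpath.relpath(target, start=".git/objects")
        if rel not in lines:  # keep each location only once
            lines.append(rel)
    return lines


# Hypothetical: the submodule has lived at two places in the history.
paths = ["submodule", "lib/submodule"]
content = "\n".join(alternates_lines(paths)) + "\n"
```

The resulting text would be what goes into objects/info/alternates, one
entry per line.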

An alternative solution would be to have submodule objects [also] in the
main (superproject) object database (for example fetched from the
submodule object repository on a supermodule commit that changes the
submodule).

Perhaps instead of objects/info/alternates we should use
objects/info/modules, or even a modules file (at the top of the .git dir).


The problem with the embedded submodule model is ensuring that changes in
the submodule go to the submodule (using submodule refs; at least HEAD and
the submodule index). And there are troubles with treating the submodule
separately, for example cloning the submodule only, or fetching from the
submodule only.


3. Output of git-ls-tree and git-ls-files (git-ls-index ;-) for
project with submodules.

$ git ls-tree HEAD
040000 tree 959dd5d97e665998eb26c764d3a889ae7903d9c2    subdir1
140000 subm ccddf1d4b0cf7fd3a699d8b33cf5bc4c5c4435b7    submodule
100644 blob a57a33b81ac6c9cb5ec0c833edc21bd66428d976    file

$ git ls-tree -r -t HEAD
040000 tree 959dd5d97e665998eb26c764d3a889ae7903d9c2    subdir1
100644 blob 70d8b9838a7333bc5a1edb93cf0e9abdbcf146cc    subdir1/sub1file
140000 subm ccddf1d4b0cf7fd3a699d8b33cf5bc4c5c4435b7    submodule
040000 tree 959dd5d97e665998eb26c764d3a889ae7903d9c2    submodule/subsubdir
100755 blob 6579f06b05c91f00f4f45015894f2bfab1076bf6    submodule/subsubdir/submfile
100644 blob a57a33b81ac6c9cb5ec0c833edc21bd66428d976    file

$ git ls-files --stage
100644 70d8b9838a7333bc5a1edb93cf0e9abdbcf146cc 0   subdir1/sub1file
140000 ccddf1d4b0cf7fd3a699d8b33cf5bc4c5c4435b7 0   submodule
100644 a57a33b81ac6c9cb5ec0c833edc21bd66428d976 0   file
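
A tool reading such output would only need to learn about one more entry
kind. Below is a small Python sketch that parses the proposed
git-ls-tree lines. The 140000 mode and the "subm" type are the ones
proposed in this mail; the parse_ls_tree_line() and submodule_paths()
helpers are made up for illustration.

```python
SUBMODULE_MODE = "140000"  # mode proposed above for commit-in-a-tree entries


def parse_ls_tree_line(line):
    """Split '<mode> <type> <sha1> <path>' into a dict."""
    mode, kind, sha1, path = line.split(None, 3)
    return {"mode": mode, "type": kind, "sha1": sha1, "path": path}


def submodule_paths(lines):
    """Return paths of entries that are submodules (commits in a tree)."""
    entries = [parse_ls_tree_line(line) for line in lines if line.strip()]
    return [e["path"] for e in entries if e["mode"] == SUBMODULE_MODE]


listing = """\
040000 tree 959dd5d97e665998eb26c764d3a889ae7903d9c2    subdir1
140000 subm ccddf1d4b0cf7fd3a699d8b33cf5bc4c5c4435b7    submodule
100644 blob a57a33b81ac6c9cb5ec0c833edc21bd66428d976    file
""".splitlines()

found = submodule_paths(listing)
```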


4. Workflow(s) for project with submodules

$ cd submodule
submodule$ edit subsubdir/submfile
submodule$ git update-index subsubdir/submfile  # this updates submodule index
submodule$ git commit -m "Submodule change"     # this changes submodule HEAD
submodule$ cd ..
$ git update-index submodule                    # this updates index
                                                # to submodule HEAD version
$ git commit -m "Change in submodule"           # this updates HEAD

Of course, as usual you should be able to do "git commit -a" to skip
"git update-index". One has to remember that "git update-index
submodule" and "git commit submodule" use the HEAD version of the submodule,
not the working area version.
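
The "HEAD version, not working area version" rule can be shown with a toy
model in Python. This is not how git stores an index; the dictionaries
and the update_index_submodule() helper are made up for illustration.

```python
SUBMODULE_MODE = "140000"  # mode used for submodule entries in this proposal


def update_index_submodule(super_index, path, submodule):
    """Record the submodule's HEAD commit in a copy of the supermodule index."""
    new_index = dict(super_index)
    # Only HEAD is consulted; uncommitted submodule changes are ignored.
    new_index[path] = (SUBMODULE_MODE, submodule["HEAD"])
    return new_index


submodule = {
    "HEAD": "ccddf1d4b0cf7fd3a699d8b33cf5bc4c5c4435b7",
    "uncommitted": ["subsubdir/submfile"],  # edited but not yet committed
}
index = {"file": ("100644", "a57a33b81ac6c9cb5ec0c833edc21bd66428d976")}
index = update_index_submodule(index, "submodule", submodule)
```

The uncommitted edit to subsubdir/submfile does not show up in the
supermodule index until it is committed in the submodule and the
supermodule index is updated again.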

There was an idea to update the superproject index not to the HEAD version
but to the version at some specified branch.


5. Extended sha1 syntax for submodules

For [almost] all commands the commit-in-tree should
be viewed as tree-ish, for example in HEAD:submodule/subsubdir (is a
tree), or HEAD:submodule/subsubdir/submfile (is a blob).

Currently a suffix ':' followed by a path names the blob or tree (or
commit) at the given path in the tree-ish object named by the part
before the colon. You cannot currently use it recursively, i.e. use
<tree-ish>:<path> to refer to a tree (or commit), and use ':' after
that, e.g. <tree-ish>:<path>:<subpath>... well, currently this does not
make much sense, as you can (and have to) use '/' as a separator.

There was a proposal to use '//' as a way to force a commit object in the
tree to be treated as commit-ish, not as a tree, so you can apply all
the extended sha1 machinery suitable for commits like ^, ^n, ~n and
also probably ^@, but perhaps not @{n}. Then making ':' recursive
would be useful, for example:

  HEAD^:submodule//~2:subsubdir/submfile
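
To show how such an expression would be read, here is a small Python
sketch that splits it into resolution steps. It is an illustration of
the proposed syntax only, not existing git behaviour. The
parse_extended_sha1() helper and the step names are made up, and it
assumes that neither ':' nor '//' occurs inside a path.

```python
def parse_extended_sha1(expr):
    """Split '<rev>:<path>[//<suffix>][:<path>...]' into (kind, value) steps."""
    head, *segments = expr.split(":")
    steps = [("rev", head)]
    for segment in segments:
        # '//' forces the commit found at <path> to be treated as commit-ish;
        # what follows it (e.g. '~2') is then applied to that commit.
        path, sep, suffix = segment.partition("//")
        steps.append(("path", path))
        if sep:
            steps.append(("as-commit", suffix))
    return steps


steps = parse_extended_sha1("HEAD^:submodule//~2:subsubdir/submfile")
```

Read left to right: resolve HEAD^ in the supermodule, look up the entry
"submodule", treat it as a commit and go two generations back, then look
up subsubdir/submfile in the tree of that submodule commit.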


-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

