Just like gitmodules(5), gitattributes(5), gitcredentials(7),
gitnamespaces(7), gittutorial(7), we'd like to provide some background
on submodules, which is not specific to the `submodule` command, but
elaborates on the background and its intended usage.

Add gitsubmodules(7), that explains the states, structure and usage of
submodules.

Signed-off-by: Stefan Beller <sbeller@xxxxxxxxxx>
---

This would replace the last patch of sb/submodule-doc, though it's still RFC.
In this revision I took care of the technical details (i.e. proper formatting,
spelling), and only slight rewording of the text.

The main issue persists; see bottom of the patch:

  SAMPLE WORKFLOWS (RFC/TODO)
  ---------------------------

  Do we need

  * an opinionated way to check for a specific state of a submodule
  * (submodule helper to be plumbing?)
  * expose the design mistake of having the (name->path) mapping inside the
    working tree, i.e. never remove a name from the submodule config even when
    the submodule doesn't exist any more.

Any opinion on these would be welcome!

Thanks,
Stefan

 Documentation/Makefile          |   1 +
 Documentation/git-submodule.txt |  36 ++------
 Documentation/gitsubmodules.txt | 194 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 200 insertions(+), 31 deletions(-)
 create mode 100644 Documentation/gitsubmodules.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index b43d66eae6..325c4735a7 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -31,6 +31,7 @@ MAN7_TXT += giteveryday.txt
 MAN7_TXT += gitglossary.txt
 MAN7_TXT += gitnamespaces.txt
 MAN7_TXT += gitrevisions.txt
+MAN7_TXT += gitsubmodules.txt
 MAN7_TXT += gittutorial-2.txt
 MAN7_TXT += gittutorial.txt
 MAN7_TXT += gitworkflows.txt
diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index 4a4cede144..d38aa2d53a 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -24,37 +24,7 @@ DESCRIPTION
 -----------
 Inspects, updates and manages submodules.
 
-A submodule allows you to keep another Git repository in a subdirectory
-of your repository. The other repository has its own history, which does not
-interfere with the history of the current repository. This can be used to
-have external dependencies such as third party libraries for example.
-
-When cloning or pulling a repository containing submodules however,
-these will not be checked out by default; the 'init' and 'update'
-subcommands will maintain submodules checked out and at
-appropriate revision in your working tree.
-
-Submodules are composed from a so-called `gitlink` tree entry
-in the main repository that refers to a particular commit object
-within the inner repository that is completely separate.
-A record in the `.gitmodules` (see linkgit:gitmodules[5]) file at the
-root of the source tree assigns a logical name to the submodule and
-describes the default URL the submodule shall be cloned from.
-The logical name can be used for overriding this URL within your
-local repository configuration (see 'submodule init').
-
-Submodules are not to be confused with remotes, which are other
-repositories of the same project; submodules are meant for
-different projects you would like to make part of your source tree,
-while the history of the two projects still stays completely
-independent and you cannot modify the contents of the submodule
-from within the main project.
-If you want to merge the project histories and want to treat the
-aggregated whole as a single project from then on, you may want to
-add a remote for the other project and use the 'subtree' merge strategy,
-instead of treating the other project as a submodule. Directories
-that come from both projects can be cloned and checked out as a whole
-if you choose to go that route.
+For more information about submodules, see linkgit:gitsubmodules[7].
 
 COMMANDS
 --------
@@ -420,6 +390,10 @@ This file should be formatted in the same way as `$GIT_DIR/config`.
 The key to each submodule url is "submodule.$name.url".
 See linkgit:gitmodules[5] for details.
 
+SEE ALSO
+--------
+linkgit:gitsubmodules[7], linkgit:gitmodules[5].
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/Documentation/gitsubmodules.txt b/Documentation/gitsubmodules.txt
new file mode 100644
index 0000000000..3369d55ae9
--- /dev/null
+++ b/Documentation/gitsubmodules.txt
@@ -0,0 +1,194 @@
+gitsubmodules(7)
+================
+
+NAME
+----
+gitsubmodules - information about submodules
+
+SYNOPSIS
+--------
+$GIT_DIR/config, .gitmodules
+
+------------------
+git submodule
+------------------
+
+DESCRIPTION
+-----------
+
+A submodule allows you to keep another Git repository in a subdirectory
+of your repository. The other repository has its own history, which does not
+interfere with the history of the current repository. This can be used to
+have external dependencies such as third party libraries for example.
+
+Submodules are composed from a so-called `gitlink` tree entry
+in the main repository that refers to a particular commit object
+within the inner repository that is completely separate.
+A record in the `.gitmodules` (see linkgit:gitmodules[5]) file at the
+root of the source tree assigns a logical name to the submodule and
+describes the default URL the submodule shall be cloned from.
+The logical name can be used for overriding this URL within your
+local repository configuration (see 'submodule init').
+
+Submodules are not to be confused with remotes, which are other
+repositories of the same project; submodules are meant for
+different projects you would like to make part of your source tree,
+while the history of the two projects still stays completely
+independent and you cannot modify the contents of the submodule
+from within the main project.
+If you want to merge the project histories and want to treat the
+aggregated whole as a single project from then on, you may want to
+add a remote for the other project and use the 'subtree' merge strategy,
+instead of treating the other project as a submodule. Directories
+that come from both projects can be cloned and checked out as a whole
+if you choose to go that route.
+
+When cloning or pulling a repository containing submodules however,
+the submodules will not be checked out by default; you need to instruct
+'clone' to recurse into submodules. The 'init' and 'update' subcommands
+of 'git submodule' will maintain submodules checked out and at an
+appropriate revision in your working tree.
+
+WHEN TO USE
+-----------
+
+Submodules, repositories inside other repositories,
+can be used for different use cases:
+
+* To have finer grained access control.
+  The design principles of Git do not allow for partial repositories to be
+  checked out or transferred. A repository is the smallest unit that a user
+  can be given access to. Submodules are separate repositories, such that
+  you can restrict access to parts of your project via the use of submodules.
+
+* To decouple Git histories.
+  Decoupling histories has different benefits.
+
+** When you want to use a (third party) library tied to a specific version.
+   Using submodules for a library allows you to have a clean history for
+   your own project and to update the library in the submodule only when needed.
+
+** In its current form Git scales up poorly for very large repositories that
+   change a lot, as the history grows very large. For that you may want to look
+   at shallow clone, sparse checkout or git-lfs.
+   However you can also use submodules to e.g. hold large binary assets
+   and these repositories are then shallowly cloned such that you do not
+   have a large history locally.
+
+STATES
+------
+
+When working with submodules, you can think of them as being in a state machine.
+Each submodule can be in a different state; the following indicators are used:
+
+* the existence of the setting of 'submodule.<name>.url' in the
+  superproject's configuration
+* the existence of the submodule's working tree within the
+  working tree of the superproject
+* the existence of the submodule's git directory within the superproject's
+  git directory at $GIT_DIR/modules/<name> or within the submodule's working
+  tree
+
+  State          URL config   working tree   git dir
+  -----------------------------------------------------
+  uninitialized  no           no             no
+  initialized    yes          no             no
+  populated      yes          yes            yes
+  depopulated    yes          no             yes
+  deinitialized  no           no             yes
+  uninteresting  no           yes            yes
+
+  invalid        no           yes            no
+  invalid        yes          yes            no
+  -----------------------------------------------------
+
+The first six states can be reached by normal git usage; the latter two are
+only shown for completeness to show all possible eight states with 3 binary
+indicators. The states in detail:
+
+uninitialized::
+The uninitialized state is the default state if neither '--recurse-submodules'
+nor '--recursive' is given when cloning. An empty directory will be put in
+the working tree as a placeholder, such that you are reminded of the
+existence of the submodule.
+---
+To transition into the initialized state
+you can use 'git submodule init', which copies the presets from the
+.gitmodules file into the config.
+
+initialized::
+Users transitioned from the uninitialized state to this state via
+'git submodule init', which preset the URL configuration. As these URLs
+may not be desired in certain scenarios, this state allows you to change the
+URLs. For example in a corporate environment you may want to run
+
+	sed -i s/example.org/$internal-mirror/ .git/config
++
+before proceeding to populate the submodules.
+
+populated::
+In the populated state you have the submodule fully available, i.e. both the
+git directory and the working tree exist. In this state you can work
+with the submodule, just like with any other repository.
+
+depopulated::
+In this state you still have the git directory around, but the working tree
+is gone. For example when the superproject checks out a revision that doesn't
+have the submodule, the state may change to depopulated.
+
+deinitialized::
+The git directory is still there, but the user is no longer interested in the
+submodule as indicated by the missing URL configuration.
+
+invalid::
+When there is no git directory for a submodule, then there is something
+seriously wrong with the submodule.
+
+INNER WORKINGS
+--------------
+
+Generally a submodule can be considered its own autonomous repository,
+that has a worktree and a git directory in separate places.
+
+The superproject only records the commit sha1 in its tree, such that
+any other information, e.g. where to obtain a copy from, is not recorded
+in the core data structures of Git. The porcelain layer of Git however
+makes use of the .gitmodules file that gives strong hints where and how
+to obtain a copy of the submodule's git repository from.
+
+On the location of the git directory
+------------------------------------
+
+Since v1.7.7 of Git, the git directory of submodules is stored inside the
+superproject's git directory at $GIT_DIR/modules/<submodule-name>.
+This location allows for the working tree to be nonexistent while keeping
+the history around. So we can use git-rm on a submodule without losing
+information that may only be local.
+
+In the future we may see a git-checkout that can check out submodules, such that
+revisions that do not contain the submodule can still be checked out without
+having to drop the submodule's git directory.
+
+It is also possible to imagine a future in which a bare repository still
+contains its submodules inside the modules sub directory, such that you can
+get a full clone including submodules from that bare repository; the URLs
+as configured or given in the .gitmodules would only be used as a backup.
+
+SAMPLE WORKFLOWS (RFC/TODO)
+---------------------------
+
+Do we need
+
+* an opinionated way to check for a specific state of a submodule
+* (submodule helper to be plumbing?)
+* expose the design mistake of having the (name->path) mapping inside the
+  working tree, i.e. never remove a name from the submodule config even when
+  the submodule doesn't exist any more.
+
+SEE ALSO
+--------
+linkgit:git-submodule[1], linkgit:gitmodules[5].
+
+GIT
+---
+Part of the linkgit:git[1] suite
-- 
2.12.0.rc0.1.g018cb5e6f4
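
On the first open question in SAMPLE WORKFLOWS (checking for a specific state of a submodule), the three indicators from the STATES table could be probed from the shell. What follows is only a rough sketch and not part of the patch: the function names `submodule_state` and `probe_submodule` are hypothetical, and it assumes that a `.git` entry inside the submodule path marks an existing working tree and that the git directory lives either there or under `$GIT_DIR/modules/<name>`.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; neither function is part of Git.

# Map the three yes/no indicators (URL config, working tree, git dir)
# to the state names used in the STATES table.
submodule_state () {
	case "$1,$2,$3" in
	no,no,no)    echo uninitialized ;;
	yes,no,no)   echo initialized ;;
	yes,yes,yes) echo populated ;;
	yes,no,yes)  echo depopulated ;;
	no,no,yes)   echo deinitialized ;;
	no,yes,yes)  echo uninteresting ;;
	*)           echo invalid ;;
	esac
}

# Probe the indicators for one submodule. Run from the top level of the
# superproject's working tree. $1 is the submodule name, $2 its path.
probe_submodule () {
	url=no tree=no gitdir=no
	git config --get "submodule.$1.url" >/dev/null 2>&1 && url=yes
	# Assumption: a .git file or directory marks an existing working tree;
	# the empty placeholder directory of an uninitialized submodule has none.
	test -e "$2/.git" && tree=yes
	# The git directory is either stored inside the superproject's git
	# directory or embedded in the submodule's working tree.
	if test -d "$(git rev-parse --git-dir 2>/dev/null)/modules/$1" ||
	   test -d "$2/.git"
	then
		gitdir=yes
	fi
	submodule_state "$url" "$tree" "$gitdir"
}
```

With these definitions sourced, `probe_submodule <name> <path>` prints one state name for the given submodule. Keeping the table lookup in `submodule_state` separate from the probing makes the mapping easy to check without a repository at hand.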