On Wed, Jun 10, 2020 at 4:16 PM Elijah Newren via GitGitGadget <gitgitgadget@xxxxxxxxx> wrote:
>
> From: Elijah Newren <newren@xxxxxxxxx>
>
> Ignoring the sparse-checkout feature momentarily, if one has a submodule and
> creates local branches within it with unpushed changes and maybe adds some
> untracked files to it, then we would want to avoid accidentally removing such
> a submodule. So, for example with git.git, if you run
>     git checkout v2.13.0
> then the sha1collisiondetection/ submodule is NOT removed even though it
> did not exist as a submodule until v2.14.0. Similarly, if you only had
> v2.13.0 checked out previously and ran
>     git checkout v2.14.0
> the sha1collisiondetection/ submodule would NOT be automatically
> initialized despite being part of v2.14.0. In both cases, git requires
> submodules to be initialized or deinitialized separately. Further, we
> also have special handling for submodules in other commands such as
> clean, which requires two --force flags to delete untracked submodules,
> and some commands have a --recurse-submodules flag.
>
> sparse-checkout is very similar to checkout, as evidenced by the similar
> name -- it adds and removes files from the working copy. However, for
> the same avoid-data-loss reasons we do not want to remove a submodule
> from the working copy with checkout, we do not want to do it with
> sparse-checkout either. So submodules need to be separately initialized
> or deinitialized; changing sparse-checkout rules should not
> automatically trigger the removal or vivification of submodules.
>
> I believe the previous wording in git-sparse-checkout.txt about
> submodules was only about this particular issue. Unfortunately, the
> previous wording could be interpreted to imply that submodules should be
> considered active regardless of sparsity patterns. Update the wording
> to avoid making such an implication. It may be helpful to consider two
> example situations where the differences in wording become important:
>
> In the future, we want users to be able to run commands like
>     git clone --sparse=moduleA --recurse-submodules $REPO_URL
> and have sparsity paths automatically set up and have submodules *within
> the sparsity paths* be automatically initialized. We do not want all
> submodules in any path to be automatically initialized with that
> command.
>
> Similarly, we want to be able to do things like
>     git -c sparse.restrictCmds grep --recurse-submodules $REV $PATTERN
> and search through $REV for $PATTERN within the recorded sparsity
> patterns. We want it to recurse into submodules within those sparsity
> patterns, but do not want to recurse into directories that do not match
> the sparsity patterns in search of a possible submodule.
>
> Signed-off-by: Elijah Newren <newren@xxxxxxxxx>
> ---
>     git-sparse-checkout: clarify interactions with submodules
>
>     gitgitgadget is going to treat this like V1, but it's really V2. V1 was
>     an inline scissors patch.
>
>     Changes since V1:

To make the record easier for those looking over the archives, V1 is over here:
https://lore.kernel.org/git/20200522142611.1217757-1-newren@xxxxxxxxx/

>      * More wording clarifications in areas pointed out by Stolee, and using
>        some of his suggested wording.
>      * In particular, given that the final sentence from V1 was causing lots
>        of problems, I just stepped back and painted a very broad stroke for
>        end users that I think will make sense to them: we have two reasons
>        tracked files might be missing from the working copy, so there are
>        two things that might limit commands that search through tracked
>        files in the working copy. Greater detail about if or how they are
>        limited can be left to the manpages of individual subcommands.
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-805%2Fnewren%2Fsparse-submodule-interactions-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-805/newren/sparse-submodule-interactions-v1
> Pull-Request: https://github.com/git/git/pull/805
>
>  Documentation/git-sparse-checkout.txt | 30 +++++++++++++++++++++++----
>  1 file changed, 26 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt
> index 1a3ace60820..c7feaeca110 100644
> --- a/Documentation/git-sparse-checkout.txt
> +++ b/Documentation/git-sparse-checkout.txt
> @@ -200,10 +200,32 @@ directory.
>  SUBMODULES
>  ----------
>
> -If your repository contains one or more submodules, then those submodules will
> -appear based on which you initialized with the `git submodule` command. If
> -your sparse-checkout patterns exclude an initialized submodule, then that
> -submodule will still appear in your working directory.
> +If your repository contains one or more submodules, then submodules
> +are populated based on interactions with the `git submodule` command.
> +Specifically, `git submodule init -- <path>` will ensure the submodule
> +at `<path>` is present, while `git submodule deinit [-f] -- <path>`
> +will remove the files for the submodule at `<path>` (including any
> +untracked files, uncommitted changes, and unpushed history). Similar
> +to how sparse-checkout removes files from the working tree but still
> +leaves entries in the index, deinitialized submodules are removed from
> +the working directory but still have an entry in the index.
> +
> +Since submodules may have unpushed changes or untracked files,
> +removing them could result in data loss. Thus, changing sparse
> +inclusion/exclusion rules will not cause an already checked out
> +submodule to be removed from the working copy. Said another way, just
> +as `checkout` will not cause submodules to be automatically removed or
> +initialized even when switching between branches that remove or add
> +submodules, using `sparse-checkout` to reduce or expand the scope of
> +"interesting" files will not cause submodules to be automatically
> +deinitialized or initialized either.
> +
> +Further, the above facts mean that there are multiple reasons that
> +"tracked" files might not be present in the working copy: sparsity
> +pattern application from sparse-checkout, and submodule initialization
> +state. Thus, commands like `git grep` that work on tracked files in
> +the working copy may return results that are limited by either or both
> +of these restrictions.
>
>
>  SEE ALSO
>
> base-commit: 87680d32efb6d14f162e54ad3bda4e3d6c908559
> --
> gitgitgadget
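
For anyone who wants to try the behavior the new SUBMODULES wording describes, here is a minimal sketch rather than a definitive test. It assumes a Git new enough to have `git sparse-checkout` and works only in throwaway repositories under a temporary directory. The names child, super, moduleA, moduleB and modules/child are made up for illustration and do not come from the patch. The `protocol.file.allow=always` setting is there only because newer Git releases refuse to clone a submodule from a local path by default.

```shell
#!/bin/sh
# Sketch: narrowing the sparsity rules should leave an already
# initialized submodule in the working copy.
# All repository and directory names here are hypothetical.
set -e

tmp=$(mktemp -d)
cd "$tmp"

# Identity for the throwaway commits made below.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A small repository that serves as the submodule's upstream.
git init -q child
(
	cd child
	echo hello >file.txt
	git add file.txt
	git commit -q -m "child: initial commit"
)

# A superproject with two plain directories.
git init -q super
cd super
mkdir moduleA moduleB
echo a >moduleA/a.txt
echo b >moduleB/b.txt
git add moduleA moduleB
git commit -q -m "super: initial commit"

# Add and initialize the submodule outside both directories.
git -c protocol.file.allow=always submodule --quiet add "$tmp/child" modules/child
git commit -q -m "super: add submodule"

# Restrict the working copy to moduleA only (cone mode).
git sparse-checkout init --cone
git sparse-checkout set moduleA

# Report which tracked paths are still on disk.
for path in moduleA/a.txt moduleB/b.txt modules/child/file.txt
do
	if test -e "$path"
	then
		echo "present: $path"
	else
		echo "missing: $path"
	fi
done
```

If the documented behavior holds, moduleB/b.txt should be reported as missing (excluded by the sparsity rules) while modules/child/file.txt stays present, since only `git submodule deinit` is supposed to remove it.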