Re: [PATCH] git-sparse-checkout: clarify interactions with submodules

On Wed, Jun 10, 2020 at 4:16 PM Elijah Newren via GitGitGadget
<gitgitgadget@xxxxxxxxx> wrote:
>
> From: Elijah Newren <newren@xxxxxxxxx>
>
> Ignoring the sparse-checkout feature momentarily, if one has a submodule and
> creates local branches within it with unpushed changes and maybe adds some
> untracked files to it, then we would want to avoid accidentally removing such
> a submodule.  So, for example with git.git, if you run
>    git checkout v2.13.0
> then the sha1collisiondetection/ submodule is NOT removed even though it
> did not exist as a submodule until v2.14.0.  Similarly, if you only had
> v2.13.0 checked out previously and ran
>    git checkout v2.14.0
> the sha1collisiondetection/ submodule would NOT be automatically
> initialized despite being part of v2.14.0.  In both cases, git requires
> submodules to be initialized or deinitialized separately.  Further, we
> also have special handling for submodules in other commands such as
> clean, which requires two --force flags to delete untracked submodules,
> and some commands have a --recurse-submodules flag.
>
> sparse-checkout is very similar to checkout, as evidenced by the similar
> name -- it adds and removes files from the working copy.  However, for
> the same avoid-data-loss reasons we do not want to remove a submodule
> from the working copy with checkout, we do not want to do it with
> sparse-checkout either.  So submodules need to be separately initialized
> or deinitialized; changing sparse-checkout rules should not
> automatically trigger the removal or vivification of submodules.
>
> I believe the previous wording in git-sparse-checkout.txt about
> submodules was only about this particular issue.  Unfortunately, the
> previous wording could be interpreted to imply that submodules should be
> considered active regardless of sparsity patterns.  Update the wording
> to avoid making such an implication.  It may be helpful to consider two
> example situations where the differences in wording become important:
>
> In the future, we want users to be able to run commands like
>    git clone --sparse=moduleA --recurse-submodules $REPO_URL
> and have sparsity paths automatically set up and have submodules *within
> the sparsity paths* be automatically initialized.  We do not want all
> submodules in any path to be automatically initialized with that
> command.
>
> Similarly, we want to be able to do things like
>    git -c sparse.restrictCmds grep --recurse-submodules $REV $PATTERN
> and search through $REV for $PATTERN within the recorded sparsity
> patterns.  We want it to recurse into submodules within those sparsity
> patterns, but do not want to recurse into directories that do not match
> the sparsity patterns in search of a possible submodule.
>
> Signed-off-by: Elijah Newren <newren@xxxxxxxxx>
> ---
>     git-sparse-checkout: clarify interactions with submodules
>
>     gitgitgadget is going to treat this like V1, but it's really V2. V1 was
>     an inline scissors patch.
>
>     Changes since V1:

For anyone reading this thread in the archives later, V1 is here:
https://lore.kernel.org/git/20200522142611.1217757-1-newren@xxxxxxxxx/


>      * More wording clarifications in areas pointed out by Stolee, and using
>        some of his suggested wording.
>      * In particular, given that the final sentence from V1 was causing lots
>        of problems, I just stepped back and painted a very broad stroke for
>        end users that I think will make sense to them: we have two reasons
>        tracked files might be missing from the working copy, so there are
>        two things that might limit commands that search through tracked
>        files in the working copy. Greater detail about if or how they are
>        limited can be left to the manpages of individual subcommands.
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-805%2Fnewren%2Fsparse-submodule-interactions-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-805/newren/sparse-submodule-interactions-v1
> Pull-Request: https://github.com/git/git/pull/805
>
>  Documentation/git-sparse-checkout.txt | 30 +++++++++++++++++++++++----
>  1 file changed, 26 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt
> index 1a3ace60820..c7feaeca110 100644
> --- a/Documentation/git-sparse-checkout.txt
> +++ b/Documentation/git-sparse-checkout.txt
> @@ -200,10 +200,32 @@ directory.
>  SUBMODULES
>  ----------
>
> -If your repository contains one or more submodules, then those submodules will
> -appear based on which you initialized with the `git submodule` command. If
> -your sparse-checkout patterns exclude an initialized submodule, then that
> -submodule will still appear in your working directory.
> +If your repository contains one or more submodules, then submodules
> +are populated based on interactions with the `git submodule` command.
> +Specifically, `git submodule init -- <path>` will ensure the submodule
> +at `<path>` is present, while `git submodule deinit [-f] -- <path>`
> +will remove the files for the submodule at `<path>` (including any
> +untracked files, uncommitted changes, and unpushed history).  Similar
> +to how sparse-checkout removes files from the working tree but still
> +leaves entries in the index, deinitialized submodules are removed from
> +the working directory but still have an entry in the index.
> +
> +Since submodules may have unpushed changes or untracked files,
> +removing them could result in data loss.  Thus, changing sparse
> +inclusion/exclusion rules will not cause an already checked out
> +submodule to be removed from the working copy.  Said another way, just
> +as `checkout` will not cause submodules to be automatically removed or
> +initialized even when switching between branches that remove or add
> +submodules, using `sparse-checkout` to reduce or expand the scope of
> +"interesting" files will not cause submodules to be automatically
> +deinitialized or initialized either.
> +
> +Further, the above facts mean that there are multiple reasons that
> +"tracked" files might not be present in the working copy: sparsity
> +pattern application from sparse-checkout, and submodule initialization
> +state.  Thus, commands like `git grep` that work on tracked files in
> +the working copy may return results that are limited by either or both
> +of these restrictions.
>
>
>  SEE ALSO
>
> base-commit: 87680d32efb6d14f162e54ad3bda4e3d6c908559
> --
> gitgitgadget
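
The last paragraph of the new SUBMODULES text says a tracked path can be
missing from the working copy for two independent reasons. A toy sketch of
that rule follows, in case it helps archive readers. It is not git's
implementation: the class ToyWorktree, its fields, and the paths under
moduleA/ and moduleB/ are all hypothetical, and it ignores details such as
cone mode always keeping top-level files.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyWorktree:
    """Toy model (an assumption, not git internals) of which tracked paths are on disk."""

    sparse_dirs: set[str]      # directories kept by the sparsity patterns
    submodules: set[str]       # paths recorded as submodules in the index
    initialized: set[str] = field(default_factory=set)  # submodules that were init'ed

    def _owning_submodule(self, path: str) -> str | None:
        # Return the submodule containing `path`, if any.
        for sub in self.submodules:
            if path == sub or path.startswith(sub + "/"):
                return sub
        return None

    def is_present(self, path: str) -> bool:
        sub = self._owning_submodule(path)
        if sub is not None:
            # Submodule content depends only on init state; sparsity
            # rules neither remove nor populate it.
            return sub in self.initialized
        # Ordinary tracked file: present only inside a kept directory.
        return any(path.startswith(d + "/") for d in self.sparse_dirs)


wt = ToyWorktree(
    sparse_dirs={"moduleA"},
    submodules={"moduleA/libfoo", "moduleB/libbar"},
    initialized={"moduleB/libbar"},
)
tracked = [
    "moduleA/main.c",        # inside the sparsity paths
    "moduleA/libfoo/foo.c",  # inside the sparsity paths, submodule not initialized
    "moduleB/util.c",        # outside the sparsity paths
    "moduleB/libbar/bar.c",  # outside the sparsity paths, submodule initialized
]
visible = [p for p in tracked if wt.is_present(p)]
print(visible)
```

In this model a command that only looks at files on disk would see
moduleA/main.c and moduleB/libbar/bar.c. The other two paths are filtered out
for different reasons: one by submodule state, the other by sparsity patterns.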


