Re: [PATCH] rm: honor sparse checkout patterns

Hi,

On Thu, Nov 12, 2020 at 1:02 PM Matheus Tavares
<matheus.bernardino@xxxxxx> wrote:
>
> Make git-rm honor the 'sparse.restrictCmds' setting, by restricting its
> operation to the paths that match both the command line pathspecs and
> the repository's sparsity patterns. This better matches the expectations
> of users with sparse-checkout definitions, while still allowing them
> to optionally enable the old behavior with 'sparse.restrictCmds=false'
> or the global '--no-restrict-to-sparse-paths' option.

(For Stolee:) Did this arise when a user specified a directory to
delete, and a (possibly small) part of that directory was in the
sparse checkout while other portions of it were outside?

I can easily see users thinking they are dealing with just the files
relevant to them, and expecting the directory deletion to only affect
that relevant subset, so this seems like a great idea.  We'd just want
to make sure we have a good error message if they explicitly list a
single path outside the sparse checkout.

> Suggested-by: Derrick Stolee <stolee@xxxxxxxxx>
> Signed-off-by: Matheus Tavares <matheus.bernardino@xxxxxx>
> ---
>
> This is based on mt/grep-sparse-checkout.
> Original feature request: https://github.com/gitgitgadget/git/issues/786
>
>  Documentation/config/sparse.txt  |  3 ++-
>  Documentation/git-rm.txt         |  9 +++++++++
>  builtin/rm.c                     |  7 ++++++-
>  t/t3600-rm.sh                    | 22 ++++++++++++++++++++++
>  t/t7011-skip-worktree-reading.sh |  5 -----
>  5 files changed, 39 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/config/sparse.txt b/Documentation/config/sparse.txt
> index 494761526e..79d7d173e9 100644
> --- a/Documentation/config/sparse.txt
> +++ b/Documentation/config/sparse.txt
> @@ -12,7 +12,8 @@ When this option is true (default), some git commands may limit their behavior
>  to the paths specified by the sparsity patterns, or to the intersection of
>  those paths and any (like `*.c`) that the user might also specify on the
>  command line. When false, the affected commands will work on full trees,
> -ignoring the sparsity patterns. For now, only git-grep honors this setting.
> +ignoring the sparsity patterns. For now, only git-grep and git-rm honor this
> +setting.
>  +
>  Note: commands which export, integrity check, or create history will always
>  operate on full trees (e.g. fast-export, format-patch, fsck, commit, etc.),
> diff --git a/Documentation/git-rm.txt b/Documentation/git-rm.txt
> index ab750367fd..25dda8ff35 100644
> --- a/Documentation/git-rm.txt
> +++ b/Documentation/git-rm.txt
> @@ -25,6 +25,15 @@ When `--cached` is given, the staged content has to
>  match either the tip of the branch or the file on disk,
>  allowing the file to be removed from just the index.
>
> +CONFIGURATION
> +-------------
> +
> +sparse.restrictCmds::
> +       By default, git-rm only matches and removes paths within the
> +       sparse-checkout patterns. This behavior can be changed with the
> +       `sparse.restrictCmds` setting or the global
> +       `--no-restrict-to-sparse-paths` option. For more details, see the
> +       full `sparse.restrictCmds` definition in linkgit:git-config[1].

Hmm, I wonder how this will read to people who are going through the
manual and have never used sparse-checkout.  It seems prone to
confusing them.  Maybe we could instead word this as:

    When sparse-checkouts are in use, by default git-rm will only match
    and remove paths within the sparse-checkout patterns...

>
>  OPTIONS
>  -------
> diff --git a/builtin/rm.c b/builtin/rm.c
> index 4858631e0f..e1fe71c321 100644
> --- a/builtin/rm.c
> +++ b/builtin/rm.c
> @@ -14,6 +14,7 @@
>  #include "string-list.h"
>  #include "submodule.h"
>  #include "pathspec.h"
> +#include "sparse-checkout.h"
>
>  static const char * const builtin_rm_usage[] = {
>         N_("git rm [<options>] [--] <file>..."),
> @@ -254,7 +255,7 @@ static struct option builtin_rm_options[] = {
>  int cmd_rm(int argc, const char **argv, const char *prefix)
>  {
>         struct lock_file lock_file = LOCK_INIT;
> -       int i;
> +       int i, sparse_paths_only;
>         struct pathspec pathspec;
>         char *seen;
>
> @@ -293,8 +294,12 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
>
>         seen = xcalloc(pathspec.nr, 1);
>
> +       sparse_paths_only = restrict_to_sparse_paths(the_repository);
> +
>         for (i = 0; i < active_nr; i++) {
>                 const struct cache_entry *ce = active_cache[i];
> +               if (sparse_paths_only && ce_skip_worktree(ce))
> +                       continue;
>                 if (!ce_path_match(&the_index, ce, &pathspec, seen))
>                         continue;
>                 ALLOC_GROW(list.entry, list.nr + 1, list.alloc);
> diff --git a/t/t3600-rm.sh b/t/t3600-rm.sh
> index efec8d13b6..7bf55b42eb 100755
> --- a/t/t3600-rm.sh
> +++ b/t/t3600-rm.sh
> @@ -892,4 +892,26 @@ test_expect_success 'rm empty string should fail' '
>         test_must_fail git rm -rf ""
>  '
>
> +test_expect_success 'rm should respect --[no]-restrict-to-sparse-paths' '
> +       git init sparse-repo &&
> +       (
> +               cd sparse-repo &&
> +               touch a b c &&
> +               git add -A &&
> +               git commit -m files &&
> +               git sparse-checkout set "/a" &&
> +
> +               # By default, it should not rm paths outside the sparse-checkout
> +               test_must_fail git rm b 2>stderr &&
> +               test_i18ngrep "fatal: pathspec .b. did not match any files" stderr &&

Ah, this answers my question about whether the user gets an error
message when they explicitly call out a single path outside the sparse
checkout.  I'm curious if we want to be slightly more verbose on the
error message when sparse-checkouts are in effect.  In particular, if
no paths match the sparsity patterns, but some paths would have
matched the pathspec ignoring the sparsity patterns, then perhaps the
error message should include a reference to the
--no-restrict-to-sparse-paths flag.
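
To make that concrete, here is a rough, standalone sketch of the
decision I have in mind.  It is a toy model, not code for builtin/rm.c:
struct toy_entry, toy_match(), classify() and hint_for() are all names
I made up for illustration, the matcher is a simplistic stand-in for
ce_path_match(), and the hint wording is only a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an index entry; NOT git's struct cache_entry. */
struct toy_entry {
	const char *path;
	int skip_worktree; /* 1 if the path is outside the sparse checkout */
};

enum rm_outcome {
	RM_MATCHED,           /* at least one path inside the sparse checkout matched */
	RM_NO_MATCH,          /* nothing matched, even when ignoring sparsity */
	RM_ONLY_SPARSE_MATCH  /* matches exist, but all are outside the sparse checkout */
};

/* Hypothetical matcher: exact path, or "dir" matching "dir/...". */
static int toy_match(const char *spec, const char *path)
{
	size_t n = strlen(spec);

	if (strncmp(spec, path, n))
		return 0;
	return path[n] == '\0' || path[n] == '/';
}

/* Count matches inside and outside the sparse checkout, then classify. */
enum rm_outcome classify(const struct toy_entry *e, size_t nr, const char *spec)
{
	size_t i, inside = 0, outside = 0;

	assert(e && spec);
	for (i = 0; i < nr; i++) {
		if (!toy_match(spec, e[i].path))
			continue;
		if (e[i].skip_worktree)
			outside++;
		else
			inside++;
	}
	if (inside)
		return RM_MATCHED;
	return outside ? RM_ONLY_SPARSE_MATCH : RM_NO_MATCH;
}

/* Placeholder wording; returns NULL when no extra hint is warranted. */
const char *hint_for(enum rm_outcome o)
{
	if (o == RM_ONLY_SPARSE_MATCH)
		return "hint: the pathspec only matched paths outside the "
		       "sparse checkout; see --no-restrict-to-sparse-paths";
	return NULL;
}
```

With the index from your test (a inside the sparse checkout, b and c
outside), 'b' would land in the RM_ONLY_SPARSE_MATCH case and get the
extra hint, while a pathspec that matches nothing at all would keep
today's plain error.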

> +
> +               # But it should rm them with --no-restrict-to-sparse-paths
> +               git --no-restrict-to-sparse-paths rm b &&
> +
> +               # And also with sparse.restrictCmds=false
> +               git reset &&
> +               git -c sparse.restrictCmds=false rm b
> +       )
> +'
> +
>  test_done

Do we also want to include a testcase where the user specifies a
directory, and part of that directory is within the sparsity patterns
while part is outside?  E.g.
'git sparse-checkout set /sub/dir && git rm -r sub'?
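
For reference, this is the outcome I would expect such a test to
check, written as a small standalone model of the loop in cmd_rm()
rather than as a real test.  Everything here is an assumption for
illustration: struct toy_entry, toy_match() and select_for_removal()
are invented names, the skip_worktree flags are set by hand instead of
being derived from sparsity patterns, and the matcher only
approximates what ce_path_match() does for a plain directory pathspec.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an index entry; NOT git's struct cache_entry. */
struct toy_entry {
	const char *path;
	int skip_worktree; /* 1 if the path is outside the sparse checkout */
};

/* Hypothetical matcher: exact path, or "dir" matching "dir/...". */
static int toy_match(const char *spec, const char *path)
{
	size_t n = strlen(spec);

	if (strncmp(spec, path, n))
		return 0;
	return path[n] == '\0' || path[n] == '/';
}

/*
 * Mirror the shape of the patched loop: skip entries outside the
 * sparse checkout first (when restricted), then apply the pathspec.
 * Selected paths are stored in out[], which must have room for nr
 * pointers.  Returns the number of selected paths.
 */
size_t select_for_removal(const struct toy_entry *e, size_t nr,
			  const char *spec, int sparse_paths_only,
			  const char **out)
{
	size_t i, n = 0;

	assert(e && spec && out);
	for (i = 0; i < nr; i++) {
		if (sparse_paths_only && e[i].skip_worktree)
			continue;
		if (!toy_match(spec, e[i].path))
			continue;
		out[n++] = e[i].path;
	}
	return n;
}
```

Given entries sub/dir/x (inside the sparse checkout), sub/other/y and
top (both outside), the pathspec 'sub' should select only sub/dir/x
when restricted, and both sub/dir/x and sub/other/y when the
restriction is turned off.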

> diff --git a/t/t7011-skip-worktree-reading.sh b/t/t7011-skip-worktree-reading.sh
> index 26852586ac..1761a2b1b9 100755
> --- a/t/t7011-skip-worktree-reading.sh
> +++ b/t/t7011-skip-worktree-reading.sh
> @@ -132,11 +132,6 @@ test_expect_success 'diff-files does not examine skip-worktree dirty entries' '
>         test -z "$(git diff-files -- one)"
>  '
>
> -test_expect_success 'git-rm succeeds on skip-worktree absent entries' '
> -       setup_absent &&
> -       git rm 1
> -'
> -
>  test_expect_success 'commit on skip-worktree absent entries' '
>         git reset &&
>         setup_absent &&
> --
> 2.28.0

Sweet, nice and simple.  Thanks for sending this in; I think it'll be very nice.


