Re: [PATCH v2] dir: check for single file cone patterns

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



William Sprent via GitGitGadget wrote:
> From: William Sprent <williams@xxxxxxxxxxx>
> 
> The sparse checkout documentation states that the cone mode pattern set
> is limited to patterns that either recursively include directories or
> patterns that match all files in a directory. In the sparse checkout
> file, the former manifest in the form:
> 
>     /A/B/C/
> 
> while the latter become a pair of patterns either in the form:
> 
>     /A/B/
>     !/A/B/*/
> 
> or in the special case of matching the toplevel files:
> 
>     /*
>     !/*/
> 
> The 'add_pattern_to_hashsets()' function contains checks which serve to
> disable cone-mode when non-cone patterns are encountered. However, these
> do not catch when the pattern list attempts to match a single file or
> directory, e.g. a pattern in the form:
> 
>     /A/B/C
> 
> This causes sparse-checkout to exhibit unexpected behaviour when such a
> pattern is in the sparse-checkout file and cone mode is enabled.
> Concretely, with the pattern like the above, sparse-checkout, in
> non-cone mode, will only include the directory or file located at
> '/A/B/C'. However, with cone mode enabled, sparse-checkout will instead
> just manifest the toplevel files but not any file located at '/A/B/C'.
> 
> Relatedly, issues occur when supplying the same kind of filter when
> partial cloning with '--filter=sparse:oid=<oid>'. 'upload-pack' will
> correctly just include the objects that match the non-cone pattern
> matching. Which means that checking out the newly cloned repo with the
> same filter, but with cone mode enabled, fails due to missing objects.

This clearly explains the issue you're trying to fix: cone mode
sparse-checkout requires patterns like '/A/B/', !/A/B/*/', '/A/B/C/', etc.,
but invalid patterns like '/A/B/C' (no trailing slash) currently don't force
the switch to non-cone mode, leading to unexpected behavior.

> 
> To fix these issues, add a cone mode pattern check that asserts that
> every pattern is either a directory match or the pattern '/*'. Add a
> test to verify the new pattern check and modify another to reflect that
> non-directory patterns are caught earlier.

I think this is the best way to maintain the current intended behavior of
cone mode sparse-checkout. While the idea of single file "exceptions" to
cone patterns has been brought up in the past (I think most recently at the
Contributor's Summit this past September [1]), it'd be a substantial change
that's definitely out-of-scope of this small bugfix.

[1] https://lore.kernel.org/git/YzXwOsaoCdBhHsX1@nand.local/

> 
> Signed-off-by: William Sprent <williams@xxxxxxxxxxx>
> ---
>     dir: check for single file cone patterns
>     
>     Resubmitting this without the superfluous double negation of the
>     'strcmp' output.
> 
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1446%2Fwilliams-unity%2Fws%2Fsparse-checkout-pattern-match-v2
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1446/williams-unity/ws/sparse-checkout-pattern-match-v2
> Pull-Request: https://github.com/gitgitgadget/git/pull/1446
> 
> Range-diff vs v1:
> 
>  1:  cfb4b75c378 ! 1:  d5406e62f6f dir: check for single file cone patterns
>      @@ dir.c: static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_
>        	}
>        
>       +	if (!(given->flags & PATTERN_FLAG_MUSTBEDIR) &&
>      -+	    !!strcmp(given->pattern, "/*")) {
>      ++	    strcmp(given->pattern, "/*")) {
>       +		/* Not a cone pattern. */
>       +		warning(_("unrecognized pattern: '%s'"), given->pattern);
>       +		goto clear_hashmaps;
> 
> 
>  dir.c                              |  7 +++++++
>  t/t1091-sparse-checkout-builtin.sh | 11 ++++++++++-
>  2 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/dir.c b/dir.c
> index fbdb24fc819..4e99f0c868f 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -732,6 +732,13 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
>  		goto clear_hashmaps;
>  	}
>  
> +	if (!(given->flags & PATTERN_FLAG_MUSTBEDIR) &&
> +	    strcmp(given->pattern, "/*")) {

This condition is only entered if:

- the pattern *does not* end in a trailing slash 
  ('!(given->flags & PATTERN_FLAG_MUSTBEDIR)'), AND
- the pattern *does not* start with '/*' ('strcmp(given->pattern, "/*")')

The '/*'-like patterns are already handled by other parts of
'add_pattern_to_hashsets()', so only the '/A/B/C'-like patterns will trigger
the "unrecognized" warning. Looks good!


> +		/* Not a cone pattern. */
> +		warning(_("unrecognized pattern: '%s'"), given->pattern);
> +		goto clear_hashmaps;
> +	}
> +
>  	prev = given->pattern;
>  	cur = given->pattern + 1;
>  	next = given->pattern + 2;
> diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
> index b563d6c263e..627267be153 100755
> --- a/t/t1091-sparse-checkout-builtin.sh
> +++ b/t/t1091-sparse-checkout-builtin.sh
> @@ -238,7 +238,7 @@ test_expect_success 'cone mode: match patterns' '
>  test_expect_success 'cone mode: warn on bad pattern' '
>  	test_when_finished mv sparse-checkout repo/.git/info/ &&
>  	cp repo/.git/info/sparse-checkout . &&
> -	echo "!/deep/deeper/*" >>repo/.git/info/sparse-checkout &&
> +	echo "!/deep/deeper/*/" >>repo/.git/info/sparse-checkout &&

This test changes because, without the trailing slash on the pattern, the
error would become "unrecognized pattern: !/deep/deeper/*" due to the new
check in 'add_pattern_to_hashsets()'. Changing the test pattern ensures
we're still exercising what the test was originally intended to exercise,
and doesn't indicate a regression.

>  	git -C repo read-tree -mu HEAD 2>err &&
>  	test_i18ngrep "unrecognized negative pattern" err
>  '
> @@ -667,6 +667,15 @@ test_expect_success 'pattern-checks: starting "*"' '
>  	check_read_tree_errors repo "a deep" "disabling cone pattern matching"
>  '
>  
> +test_expect_success 'pattern-checks: non directory pattern' '
> +	cat >repo/.git/info/sparse-checkout <<-\EOF &&
> +	/deep/deeper1/a
> +	EOF
> +	check_read_tree_errors repo deep "disabling cone pattern matching" &&
> +	check_files repo/deep deeper1 &&
> +	check_files repo/deep/deeper1 a
> +'

And this test ensures the new check is working for the appropriate patterns.

This patch looks good to me, thanks for finding & fixing the bug!

> +
>  test_expect_success 'pattern-checks: contained glob characters' '
>  	for c in "[a]" "\\" "?" "*"
>  	do
> 
> base-commit: 7c2ef319c52c4997256f5807564523dfd4acdfc7




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux