Re: [PATCHv8 5/5] pathspec: allow querying for attributes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Stefan Beller <sbeller@xxxxxxxxxx> writes:

> diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
> index cafc284..aa9f220 100644
> --- a/Documentation/glossary-content.txt
> +++ b/Documentation/glossary-content.txt
> @@ -384,6 +384,23 @@ full pathname may have special meaning:
>  +
>  Glob magic is incompatible with literal magic.
>  
> +attr;;
> +	Additionally to matching the pathspec, the path must have the
> +	attribute as specified. The syntax for specifying the required
> +	attributes is "`attr: [mode] <attribute name> [=value]`"
> ++
> +Attributes can have 4 states (Set, Unset, Set to a value, unspecified) and
> +you can query each attribute for certain states. The "`[mode]`" is a special
> +character to indicate which attribute states are looked for. The following
> +modes are available:
> +
> + - an empty "`[mode]`" matches if the attribute is set
> + - "`-`" the attribute must be unset
> + - "`!`" the attribute must be unspecified
> + - an empty "`[mode]`" combined with "`[=value]`" matches if the attribute has
> +   the given value.
> ++

As an initial design, I find this much more agreeable than the
previous rounds.  I however find the phrasing of the above harder
than necessary to understand, for a few reasons.

 * Mixed use of "X matches if ..." and "... must be Y" makes it
   unclear if they are talking about different kind of things, or
   the same kind of things in merely different ways.

 * It does not make it clear "=value" is only meaningful when [mode]
   is empty.

Perhaps dropping the '[mode]' thing altogether and instead saying

        After `attr:` comes a space separated list of "attribute
       	requirements", all of which must be met in order for the
       	path to be considered a match; this is in addition to the
       	usual non-magic pathspec pattern matching.

	Each of the attribute requirements for the path takes one of
	these forms:

	- "`ATTR`" requires that the attribute `ATTR` must be set.

	- "`-ATTR`" requires that the attribute `ATTR` must be unset.

        - "`ATTR=VALUE`" requires that the attribute `ATTR` must be
          set to the string `VALUE`.

        - "`!ATTR`" requires that the attribute `ATTR` must be
          unspecified.

would make the resulting text easier to read?

> +static int match_attrs(const char *name, int namelen,
> +		       const struct pathspec_item *item)
> +{
> +	int i;
> +
> +	git_check_attr_counted(name, namelen, item->attr_check);
> +	for (i = 0; i < item->attr_match_nr; i++) {
> +		const char *value;
> +		int matched;
> +		enum attr_match_mode match_mode;
> +
> +		value = item->attr_check->check[i].value;
> +		match_mode = item->attr_match[i].match_mode;
> +
> +		if (ATTR_TRUE(value))
> +			matched = match_mode == MATCH_SET;
> +		else if (ATTR_FALSE(value))
> +			matched = match_mode == MATCH_UNSET;
> +		else if (ATTR_UNSET(value))
> +			matched = match_mode == MATCH_UNSPECIFIED;

readability nit:

	matched = (match_mode == MATCH_WHATEVER);

would be easier to view

> +		else
> +			matched = (match_mode == MATCH_VALUE &&
> +				   !strcmp(item->attr_match[i].value, value));

and would match the last case above better.

> +static void parse_pathspec_attr_match(struct pathspec_item *item, const char *value)
> +{
> +	struct string_list_item *si;
> +	struct string_list list = STRING_LIST_INIT_DUP;
> +
> +
> +	if (!value || !strlen(value))
> +		die(_("attr spec must not be empty"));
> +
> +	string_list_split(&list, value, ' ', -1);
> +	string_list_remove_empty_items(&list, 0);
> +
> +	if (!item->attr_check)
> +		item->attr_check = git_attr_check_alloc();
> +	else
> +		die(_("Only one 'attr:' specification is allowed."));
> +
> +	ALLOC_GROW(item->attr_match, item->attr_match_nr + list.nr, item->attr_match_alloc);
> +
> +	for_each_string_list_item(si, &list) {
> +		size_t attr_len;
> +
> +		int j = item->attr_match_nr++;
> +		const char *attr = si->string;
> +		struct attr_match *am = &item->attr_match[j];
> +
> +		if (attr[0] == '!')
> +			am->match_mode = MATCH_UNSPECIFIED;
> +		else if (attr[0] == '-')
> +			am->match_mode = MATCH_UNSET;
> +		else
> +			am->match_mode = MATCH_SET;
> +
> +		if (am->match_mode != MATCH_SET)
> +			/* skip first character */
> +			attr++;
> +		attr_len = strcspn(attr, "=");
> +		if (attr[attr_len] == '=') {
> +			am->match_mode = MATCH_VALUE;
> +			am->value = xstrdup(&attr[attr_len + 1]);
> +			if (strchr(am->value, '\\'))
> +				die(_("attr spec values must not contain backslashes"));
> +		} else
> +			am->value = NULL;
> +

Let me try a variant to see if we can improve it (thinking aloud).

	switch (*attr) {
        case '!':
        	am->match_mode = MATCH_UNSPECIFIED;
		attr++;
                break;
        case '-':
        	am->match_mode = MATCH_UNSET;
		attr++;
                break;
	default:
		attr_len = strcspn(attr, "=");
		if (attr[attr_len] != '=')
			am->match_mode = MATCH_SET;
		else {
			am->match_mode = MATCH_VALUE;
	                am->value = xstrdup(...);
		}
                break;
	}

Using switch/case does not save line counts at all but it may still
make the result easier to follow.  More importantly, it would help
you catch "!ATTR=VAR" as an "invalid attribute name 'ATTR=VAR' was
requested" error by arranging the code to make sure that scanning
for '=' would not happen in !/- case (in other words, while thiking
aloud with an alternative way to write the same thing, I spotted a
bug in the original ;-).

> +		if (!attr_name_valid(attr, attr_len)) {
> +			struct strbuf sb = STRBUF_INIT;
> +			am->match_mode = INVALID_ATTR;
> +			invalid_attr_name_message(&sb, attr, attr_len);
> +			die(_("invalid attribute in '%s': '%s'"), value, sb.buf);
> +		}
> +
> +		am->attr = git_attr_counted(attr, attr_len);

I'd do this the other order, i.e.

	am->attr = git_attr_counted(...);
        if (!am->attr) {
		...
                die(...);
	}

It is wasteful to call attr_name_valid() yourself before seeing an
error from git_attr_counted().

> diff --git a/t/t6134-pathspec-with-labels.sh b/t/t6134-pathspec-with-labels.sh
> new file mode 100755
> index 0000000..c0d8cda
> --- /dev/null
> +++ b/t/t6134-pathspec-with-labels.sh
> @@ -0,0 +1,166 @@
> +#!/bin/sh
> +
> +test_description='test labels in pathspecs'
> +. ./test-lib.sh
> +
> +test_expect_success 'setup a tree' '
> +	mkdir sub &&
> +	for p in fileA fileB fileC fileAB fileAC fileBC fileNoLabel fileUnsetLabel fileSetLabel fileValue fileWrongLabel; do
> +		: >$p &&
> +		git add $p &&
> +		: >sub/$p
> +		git add sub/$p
> +	done &&
> +	git commit -m $p &&
> +	git ls-files >actual &&
> +	cat <<EOF >expect &&
> +fileA
> +fileAB
> +fileAC
> +fileB
> +fileBC
> +fileC
> +fileNoLabel
> +fileSetLabel
> +fileUnsetLabel
> +fileValue
> +fileWrongLabel
> +sub/fileA
> +sub/fileAB
> +sub/fileAC
> +sub/fileB
> +sub/fileBC
> +sub/fileC
> +sub/fileNoLabel
> +sub/fileSetLabel
> +sub/fileUnsetLabel
> +sub/fileValue
> +sub/fileWrongLabel
> +EOF

	cat <<-\EOF >expect &&
        fileA
        ...
        EOF

to indent the whole thing?

> +test_expect_success 'setup .gitattributes' '
> +	cat <<EOF >.gitattributes &&

Likewise.

> +fileWrongLabel label☺
> +EOF
> +	git add .gitattributes &&
> +	git commit -m "add attributes"
> +'
> +
> +sq="'"

I do not think you use this $sq variable below, but I may have
overlooked its use.

> +
> +test_expect_success 'check specific set attr' '
> +	cat <<EOF >expect &&
> +fileSetLabel
> +sub/fileSetLabel
> +EOF
> +	git ls-files ":(attr:label)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'check specific unset attr' '
> +	cat <<EOF >expect &&
> +fileUnsetLabel
> +sub/fileUnsetLabel
> +EOF
> +	git ls-files ":(attr:-label)" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'check specific value attr' '
> +	cat <<EOF >expect &&
> +fileValue
> +sub/fileValue
> +EOF
> +	git ls-files ":(attr:label=foo)" >actual &&
> +	test_cmp expect actual &&
> +	git ls-files ":(attr:label=bar)" >actual &&
> +	test_must_be_empty actual
> +'
> +
> +test_expect_success 'check unspecified attr' '
> +	cat <<EOF >expect &&
> +.gitattributes
> +...
> +EOF
> +	git ls-files :\(attr:\!label\) >actual &&

Why not the more normal-looking ":(attr:!label)"?  I do not think
history substitution would trigger in an script.

And no, "I may cut and paste into my interactive shell while
debugging" is not a good reason to make the end-result (i.e. script)
less readable.

> +test_expect_success 'check multiple unspecified attr' '
> +	cat <<EOF >expect &&
> +.gitattributes
> +fileC
> +fileNoLabel
> +fileWrongLabel
> +sub/fileC
> +sub/fileNoLabel
> +sub/fileWrongLabel
> +EOF
> +	git ls-files :\(attr:\!labelB\ \!labelA\ \!label\) >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'check label with more labels but excluded path' '
> +	cat <<EOF >expect &&
> +fileAB
> +fileB
> +fileBC
> +EOF
> +	git ls-files ":(attr:labelB)" ":(exclude)sub/" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'check label excluding other labels' '
> +	cat <<EOF >expect &&
> +fileAB
> +fileB
> +fileBC
> +sub/fileAB
> +sub/fileB
> +EOF
> +	git ls-files ":(attr:labelB)" ":(exclude,attr:labelC)sub/" >actual &&
> +	test_cmp expect actual
> +'
> +
> +test_expect_success 'abort on giving invalid label on the command line' '
> +	test_must_fail git ls-files . ":(attr:☺)" 2>actual &&
> +	test_i18ngrep "fatal" actual
> +'
> +
> +test_done
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]