Re: [PATCH 5/5] pathspec: record labels

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Stefan Beller <sbeller@xxxxxxxxxx> writes:

> Labels were originally designed to manage large amount of
> submodules, the discussion steered this in a more general
> direction, such that other files can be labeled as well.

s/other files/any path/.

> Labels are meant to describe arbitrary set of files, which
> is not described via the tree layout.

Let's avoid "are meant to", which is merely to give you an excuse to
say "it was meant to ... but the implementation did not quite achieve
that goal".

    The "label" attribute can be attached to paths, and the pathspec
    mechanism is extended via the new ":(label=X)pattern/to/match"
    syntax to filter paths so that it requires paths to not just
    match the given pattern but also have the specified labels
    attached for them to be chosen.

or something?

> +Attaching labels to path
> +~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +`label`
> +^^^^^^^^^^
> +By the value of this attribute you can specify a list of arbitrary strings
> +to be attached to a path as labels. These labels can be used for
> +easier paths matching using pathspecs (linkgit:gitglossary[1]).
> +It is recommended to use only alphanumeric characters, dashes and
> +underscores for the labels.

Make this recomendation stronger to requirement.  It is always
possible to loosen it later, but once we allow random things even
with a warning, it gets impossible to take them back. Users will say
"Even though we got a warning, it used to correctly filter; now Git
is broken and my setup does not work."

> diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
> index 8ad29e6..f990017 100644
> --- a/Documentation/glossary-content.txt
> +++ b/Documentation/glossary-content.txt
> @@ -384,6 +384,10 @@ full pathname may have special meaning:
>  +
>  Glob magic is incompatible with literal magic.
>  
> +label:<white space separated list>;;
> +	Additionally to matching the pathspec, the path must habe a 'label'
> +	attribute having set all labels listed here.

s/habe/have/

> +	if (ATTR_TRUE(check.value))
> +		ret = 1; /* has all the labels */

So this is "pretend that I have all the labels under the sun".

> +	else if (ATTR_FALSE(check.value))
> +		ret = 0; /* has no labels */

I do not see a value in this.  What difference does it have compared
to having a "label=" that explicitly says "I do not have one"?  What
is your answer to an end-user question "When should I say 'label='
and when should I say '!label'?"

Just forbid it for now; we may find a better use for it later.

> +	else if (ATTR_UNSET(check.value))
> +		/*
> +		 * If no label was specified this matches, otherwise
> +		 * there is a missing label.
> +		 */
> +		ret = (required_labels->nr > 0) ? 0 : 1;

Hmm, I can see that this is making things more complicated to
explain and understand, but I cannot see what the expected use case
is.

Unless there is any compelling use case, I'd say we should forbid it
for now.

> +	else {
> +		struct string_list_item *si;
> +		struct string_list attr_labels = STRING_LIST_INIT_DUP;
> +		string_list_split(&attr_labels, check.value, ',', -1);
> +		string_list_sort(&attr_labels);

Filter out a non-compliant label values here, so that they are
ignored from day one.  That way you would not have to deal with "I
know I got the warning, but it used to work and you broke it" later.

> +		ret = 1;
> +		for_each_string_list_item(si, required_labels) {
> +			if (!string_list_has_string(&attr_labels, si->string)) {
> +				ret = 0;
> +				break;
> +			}
> +		}
> +		string_list_clear(&attr_labels, 0);
> +	}
> +
> +	return ret;
> +}

> +static void validate_label_name(const char *label)
> +{
> +	if (check_valid_label_name(label))
> +		warning(_("Label '%s' discouraged. Recommended label names start "
> +			"with an alphabetic character and use only alphanumeric "
> +			"characters, dashes and underscores."), label);
> +}

Instead of returning void, parrot the return value from
check_valid_label_name().

> +		/* The label token may have no argument, so no trailing ':' */

Why close the door for future pathspec magic "labelfoo" by being
unnecessarily[*1*] lenient?

    [*1*] Does ":(label)" mean anything useful, and is there a good
          reason why it should behave differently from ":(label:)"?

Unless there is a good reason why users would want ":(label)", I'd
say we should make it a syntax error.

> +		if (skip_prefix(copyfrom, "label", &body)) {
> +			struct strbuf sb = STRBUF_INIT;
> +			skip_prefix(body, ":", &body);
> +			strbuf_add(&sb, body, nextat - body);
> +			if (!item->labels) {
> +				item->labels = xmalloc(sizeof(*item->labels));
> +				string_list_init(item->labels, 1);
> +			} else
> +				die(_("multiple labels not supported in pathspec magic"));

I am NOT suggesting to make this enhancement in the prototype to
allow us experiment with submodule selection use case, but this is
an obvious place to allow

	:(label=A B):(label=C D)

to mean ((A & B) | (C & D)) by making item->labels an array of set
of labels.

And no, I do not think arbitrary boolean expression is too complex
to understand to the end-users, especially if we have to stay within
the pathspec magic syntax.  And my gut feeling is that it is not
worth it to support anything more complex than "OR of these ANDed
ones".

> +			string_list_split(item->labels, sb.buf, ' ', -1);
> +			string_list_remove_empty_items(item->labels, 0);
> +			strbuf_release(&sb);
> +			continue;

The data structure to record the "required labels" is shared
knowledge between this function and the has_all_labels() and nobody
else knows it is done with string_list, so I think this is a good
balance between expediency and future optimization possibilities (I
am anticipating that linear search of string list would be found as
performance bottleneck).

> @@ -447,6 +490,12 @@ void parse_pathspec(struct pathspec *pathspec,
>  		if (item[i].nowildcard_len < item[i].len)
>  			pathspec->has_wildcard = 1;
>  		pathspec->magic |= item[i].magic;
> +
> +		if (item[i].labels) {
> +			struct string_list_item *si;
> +			for_each_string_list_item(si, item[i].labels)
> +				validate_label_name(si->string);

Reject a request to use a label that does not validate here.

The series is surprisingly simple and was a pleasant read (I no
longer recall how this compares with the earlier change to do this
as [submodule "x"] label=X).  Especially, given that the first two
patches are unrelated clean-ups that can be split out, this now
consists of merely 3 patches, each of which is an easily digestable
size.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]