Re: [RFC-PATCHv6 4/4] pathspec: allow querying for attributes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Stefan Beller <sbeller@xxxxxxxxxx> writes:

> +static struct git_attr_check *check;
> +static int match_attrs(const char *name, int namelen,
> +		       const struct pathspec_item *item)
> +{
> +	char *path;
> +	int i;
> +
> +	if (!check) {
> +		check = git_attr_check_alloc();
> +		for (i = 0; i < item->attr_nr; i++)
> +			git_attr_check_append(check, item->attrs[i].attr);

This is simply wrong; you may have two pathspec elements with
attribute match magic, the first one may ask for one attribute while
the second one may ask for seven.  The first time around you
allocate and append one attribute.  The second time around you don't
do anything useful, and send a git_attr_check with one element to
deal with 7 attributes.

> +	}
> +	path = xmemdupz(name, namelen);
> +	git_all_attrs(path, check);

However, the above "This is simply wrong" bogosity is covered
because git_all_attrs() is used here, ignoring what is in check.

The loop we see above is an expensive no-op, as the first thing
all_attrs() does is to empty check() and instead stuff it with every
attribute under the sun, not necessarily limited to attributes in
item->attrs[].

By the way, do not call an array as plural.  item->attr[i] is a good
name to call a single ith element in an array.  item->attrs[i] isn't.

> +	for (i = 0; i < item->attr_nr; i++) {
> +		int matched;
> +		const char *value = check->check[i].value;

check[i] has no relevance to item->attrs[i] here.  I do not think
the code after this point is computing anything sensible.

> diff --git a/pathspec.h b/pathspec.h
> index 0c11262..89d73db 100644
> --- a/pathspec.h
> +++ b/pathspec.h
> @@ -32,6 +32,21 @@ struct pathspec {
>  		int len, prefix;
>  		int nowildcard_len;
>  		int flags;
> +		int attr_nr;
> +		int attr_alloc;
> +		struct attr_item {
> +			char *attr;
> +			char *value;
> +			enum attr_match_mode {
> +				NOT_INIT,
> +				MATCH_SET,
> +				MATCH_UNSET,
> +				MATCH_VALUE,
> +				MATCH_UNSPECIFIED,
> +				MATCH_NOT_UNSPECIFIED,
> +				INVALID_ATTR
> +			} mode;
> +		} *attrs;

I'd think the above addition that is in line with the updated API
would look more like this [*1*]:

	int attr_match_nr;
        int attr_match_alloc;
        struct attr_match {
        	struct git_attr *attr;
                char *value;
                enum attr_match_mode {
                	...
		} match_mode;
	} *attr_match;
	struct git_attr_check *attr_check;

Then while parsing ":(attr:VAR1=VAL1 -VAR2 VAR3...)path/to/dir/",
you would first do:

	p->attr_check = git_attr_check_alloc();

once, and then send each of VAR1=VAL2, -VAR2, VAR3... to your
parse_one_item() helper function which would:

 * parse the match-mode like your code does;

 * parse out the attribute name (i.e. VAR1, VAR2 and VAR3), and
   instead of keeping it as a "(const) char *", call git_attr()
   to intern it (and keep it in local variable "attr"), and save
   it in p->attr_match[p->attr_nr].attr;

 * call git_attr_check_append(p->attr_check, git_attr_name(attr))

After the above finishes, you would end up with something like:

        .attr_match = {
            { .attr = git_attr("VAR1"), .value = "VAL2",
              .match_mode = MATCH_VALUE },
            { .attr = git_attr("VAR2"), .value = <does not matter>,
              .match_mode = MATCH_UNSET },
            ...
	},
        .attr_check = {
	    .check = {
            	{ .attr = git_attr("VAR1"), .value = <does not matter> },
            	{ .attr = git_attr("VAR2"), .value = <does not matter> },
            	{ .attr = git_attr("VAR3"), .value = <does not matter> },
            }
	    
When matching (i.e. the match_attrs() function), you would instead
do

	path = xmemdupz(name, namelen);
	git_check_attr(path, item->attr_check);

to grab values for only attributes that matter to you, instead of
calling git_all_attrs() [*2*].

After git_check_attr() returns, item->attr_check.check[0].attr would
be git_attr("VAR1") and item->attr_check.check[0].value would be
whatever setting the path has for the VAR1 attribute.  You can use
your match_mode logic to compare it with the values .attr_match
expects.

You do not necessarily have to have the same number of elements in
.attr_match and .attr_check.check by the way.  .attr_match might say

	VAR1=VAL2 !VAR1 -VAR1

which may be always false if these are ANDed together, but in order
to evaluate it, you need only one git_attr_check_elem for VAR1.


[Footnote]

*1* With the old API, things would not be that much different.
    Instead of single structure .attr_check, you would make an
    array of git_attr_check structure, exactly like the array
    at .attr_check.check[] in the new API by hand.  The new API
    makes this preparation simpler by managing the array on the API
    implementation side.

*2* Please do not use that silly function especially in performance
    sensitive codepath.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]