Re: [PATCHv7 5/5] pathspec: allow querying for attributes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, May 18, 2016 at 12:47 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Stefan Beller <sbeller@xxxxxxxxxx> writes:
>
>> + - "`+`" the attribute must be set
>> + - "`-`" the attribute must be unset
>> + - "`~`" the attribute must be unspecified

Well the '-' is in line, at least.
So you're proposing empty string for set?

gitattributes documentation:
> Sometimes you would need to override an setting of an attribute for a
> path to Unspecified state. This can be done by listing the name of the
> attribute prefixed with an exclamation point !.

So the unspecified should be '!' to match Git consistency.

[I dislike the exclamation mark because of bash.
    git ls-files ":(attr:!text)"
    bash: !text: event not found
    git ls-files ":(attr:\!text)"
    fatal: attr spec '\!text': attrs must not start with '-' and be
composed of [-A-Za-z0-9_.].
    # we error out because of the \ in there
    git ls-files :\(attr:\!text\)
    ...
    # that actually works.
]

>
> I think using these three, not the way how .gitattributes specify
> them, is highly confusing to the users.

Ok I'll align those.

>
>> + - "`?`" the attribute must not be unspecified, i.e. set, unset or value matches
>
> The choice of '?' is OK, as there is no reason .gitattributes wants
> to say something fuzzy like "this is set or unset or has value".
> The last part "set, unset or value matches" does not make sense,
> though.  Did you mean "set, unset or set to value"?

Actually I meant "this is set or unset or has [any] value".

>
>> + - an empty "`[mode]`" matches if the attribute is set or has a value
>
> The same comment as +/-/~ above applies.

So consistency would ask for

 - an empty "`[mode]`" matches if the attribute is set
 - "`-`" the attribute must be unset
 - "`!`" the attribute must be unspecified
 - an empty "`[mode]`" combined with "`[=value]`" matches if the attribute has
   the given value.

and those are not yet codified, but for discussion then:

 - "`?`" the attribute must not be unspecified, i.e. set, unset or has any value
 - "`+`" the attribute must be set or has any value

>> +
>> +             match_mode = item->attr_match[i].match_mode;
>
> Mental note: there is a one-to-one correspondence between
> attr_check->check[] and attr_match[].

Right. I guess I should add a comment for that. Or better yet a test
with some weird pathspec making use of that

    git ls-files :(attr:labelA !labelB +labelA)

That way we would see if there will be confusion between the modes
and attr names in the future.

>
>> +             if (!matched)
>> +                     return 0;
>
> So this ANDs together.  OK.

Sure, (it's clear to me, maybe I should document that as well),
because multiple pathspecs are OR.

    git ls-files :(attr:labelA) :(attr:labelB)

>
> Mental note after skiming the caller:
>
> The "value" here is like "VAR1=VAL1 VAR2 !VAR3" in a pathspec
> ":(attr:VAR1=VAL1 VAR2 !VAR3)path/to/limit", i.e. the source string
> to be compiled into a list of conditionals that will be evaluated by
> match_attr() in dir.c

Right.

>> +     string_list_split(&list, value, ' ', -1);
>> +     string_list_remove_empty_items(&list, 0);

Right.

>
> At this point, we got "VAR1=VAL1", "VAR2", "!VAR3" as elements in
> this list.

Right.

>
>> +     if (!item->attr_check)
>> +             item->attr_check = git_attr_check_alloc();
>
> Given more than one "attr" magic, e.g. ":(attr:A=a,attr:B)/path",
> the check may not be empty when we process the second one; we just
> extend it without losing the existing contents.

That is why I am not super happy with it though.

    ":(attr:A=a,attr:B)/path",
    ":(attr:A=a B)/path",

are the same for the user as well as in the internal data structures.
This "wastes" the white space as a possible convenient separator
character, e.g. for multiple values. On the other hand it will be easier
to type, specially for many attrs (read submodule groups).

>
>> +     ALLOC_GROW(item->attr_match, item->attr_match_nr + list.nr, item->attr_match_alloc);
>
> Likewise, we extend attr_match[] array.

>> +
>> +             val_len = strcspn(val, "=,)");
>
> I understand "=", but can "," and ")" appear here?

This was overly caution from some intermediate state, where the caller
handed in more than required.

    if (skip_prefix(copyfrom, "attr:", &body)) {
        char *pass = xmemdupz(body, len - strlen("attr:"));
        parse_pathspec_attr_match(item, pass);
        free(pass);
        continue;
    }

When given ":(attr:VAR1=VAL1 VAR2 !VAR3)path/to/limit"
this only passes in "VAR1=VAL1 VAR2 !VAR3".

I contemplated write the caller as

    if (skip_prefix(copyfrom, "attr:", &body)) {
        parse_pathspec_attr_match(item, body);
        continue;
    }

but that would include further characters, i.e. ',' or ')'.
(but not a lot more)

I wanted to pass in an uncounted string though, as
string_list_separate doesn't offer a variant for counted
strings.

>
>> +             if (val[val_len] == '=') {
>> +                     am->match_mode = MATCH_VALUE;
>> +                     am->value = xstrdup(&val[val_len + 1]);
>> +                     /*
>> +                      * NEEDSWORK:
>> +                      * Do we want to allow escaped commas to search
>> +                      * for comma separated values?
>> +                      */
>
> No, we don't ;-).

ok will drop comment, ...

>
>> +                     if (strchr(am->value, '\\'))
>> +                             die(_("attr spec values must not contain backslashes"));
>
> But this is a good escape hatch to reserve for our own use, until we
> decide what the quoting mechanism will be (or if it is necessary).

but keep the code here

>
>> +             } else
>> +                     am->value = NULL;
>> +
>> +             if (invalid_attr_name(val, val_len)) {
>> +                     am->match_mode = INVALID_ATTR;
>> +                     goto err;
>> +             }
>> +
>> +             am->attr = git_attr(xmemdupz(val, val_len));
>> +             git_attr_check_append(item->attr_check, am->attr);
>
> GOOD!

except for the memory leak of xmemdupz. git_attr makes another
internal copy with FLEX_ALLOC_MEM. That ill be fixed once we use
git_attr_counted though.



>
> I think val and val_len should be renamed to attr and attr_len (in
> the VARIABLE=VALUE context, these two identifiers are about parsing
> the variable side, not the value side).

ok

>
>> +     }
>> +
>> +     string_list_clear(&list, 0);
>> +     return;
>> +err:
>> +     die(_("attr spec '%s': attrs must not start with '-' and "
>> +           "be composed of [-A-Za-z0-9_.]."), value);
>
> What is "value" at this point?  If you failed to parse the second
> element in "VAR1=VAL1 FR*TZ=VAL2 !VAR3", I think you would want to
> say "'FR*TZ' is malformed".
>
> Existing users of the function seems to say this:
>
>                 if (invalid_attr_name(cp, len)) {
>                         fprintf(stderr,
>                                 "%.*s is not a valid attribute name: %s:%d\n",
>                                 len, cp, src, lineno);
>                         return NULL;
>                 }
>
> when parsing .gitattribute file.  The source file:line reference
> does not apply to this context, but it would be better to unify the
> message.  I do not mind spelling the rules out explicitly like you
> did, but I do not want to see it spread across many places (which
> forces us to update them when we have to change the rule later).
>
> Perhaps we want a helper function in attr.c side that takes the
> attribute name string (val, val_len in your code above, which I
> suggest to be renamed to attr, attr_len) and formats the error
> message into a strbuf, or something?

That's why you added the reporting function to attrs, I see.

>
> Makes me wonder what "pass" stands for.  From the look of xmemdupz(),
> given ":(attr:VAR1=VAL1 VAR2 !VAR3)path/to/limit" to parse, len
> points at ")" and xmemdupz() gives "VAR1=VAL1 VAR2 !VAR3" to it.

"what we _pass_ to the parse_pathspec_attr_match"... Maybe attr_body?

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]