Re: [RFC-PATCHv6 4/4] pathspec: allow querying for attributes

Junio C Hamano <gitster@xxxxxxxxx> · Tue, 17 May 2016 13:25:58 -0700

Stefan Beller <sbeller@xxxxxxxxxx> writes:

> On Mon, May 16, 2016 at 10:03 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>> When matching (i.e. the match_attrs() function), you would instead
>> do
>>
>>         path = xmemdupz(name, namelen);
>>         git_check_attr(path, item->attr_check);
>>
>> to grab values for only attributes that matter to you, instead of
>> calling git_all_attrs() [*2*].
>>
>> After git_check_attr() returns, item->attr_check.check[0].attr would
>> be git_attr("VAR1") and item->attr_check.check[0].value would be
>> whatever setting the path has for the VAR1 attribute.  You can use
>> your match_mode logic to compare it with the values .attr_match
>> expects.
>>
>> You do not necessarily have to have the same number of elements in
>> .attr_match and .attr_check.check by the way.  .attr_match might say
>>
>>         VAR1=VAL2 !VAR1 -VAR1
>>
>> which may be always false if these are ANDed together, but in order
>> to evaluate it, you need only one git_attr_check_elem for VAR1.
>

The key phrase in the message you are reacting to is "not
necessarily".  It is not a crime to ask for the same attribute twice
in a git_attr_check structure.

    $ git check-attr text text text -- path

would stuff three instances of "text" in there and ask them for
"path".  The simple in-code callers that uses git_attr_check_initl()
do rely on the order of the attributes it placed in attr_check
structure (see e.g. how ll_merge() uses check[0].value and
check[1].value to see the driver name and marker size), and that is
perfectly kosher.  Existing code is your friend.

The mention of the possibility is purely as a hint useful for a
possible enhancement in the far future.  If we ever want to support
something like this:

	":(attr-expression (VAR1=VAL1 | VAR1=VAL2) & VAR2)"

you can remember that you can put VAR1 and VAR2 in attr_check to
grab values for VAR1 and VAR2 (even though VAR1 is mentioned twice
in the expression), and use them in the evaluation you will perform.

> So for the matching we would need to get the order right, i.e.
>
>     const char *inspect_name = git_attr_name(item.attr_match[i].attr);
>     for (j=0; j <  p.attr_check.check_nr; j++) {
>         const char *cur_name = git_attr_name(p.attr_check.check[j].attr);
>         if (!strcmp(inspect_name, cur_name))
>             break;

You do not strcmp() when you have attributes.  They are interned so
that you can compare their addresses.  That makes it somewhat
cheaper.

Once you start "expression over random attributes", you'd need to
map attr => value somehow.  The format attr_check structure gives
you, i.e. a list of <attr, value>, is aimed at compactness than
random hashmap-like access.  If the caller wants a hashmap-like
access for performance purposes, the caller does that itself.

Existing users do not need a hashmap-like access, because they know
at which index in attr_check they placed request for what attribute.
An array that can be indexed with a small integer is exactly what
they want.

> This doesn't look cheap to me? Am I holding it wrong again?

By the way, I do not think during the entire discussion on this
topic, you have never been in a situation to deserve the "holding it
wrong" label (which implies "a ware is broken, but somehow the end
user is blamed for using it incorrectly").  When you were wrong, you
were simply wrong.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html