Re: [RFC-PATCHv6 4/4] pathspec: allow querying for attributes

Junio C Hamano <gitster@xxxxxxxxx> · Wed, 18 May 2016 08:39:06 -0700

Junio C Hamano <gitster@xxxxxxxxx> writes:

> The mention of the possibility is purely as a hint useful for a
> possible enhancement in the far future.  If we ever want to support
> something like this:
>
> 	":(attr-expression (VAR1=VAL1 | VAR1=VAL2) & VAR2)"
>
> you can remember that you can put VAR1 and VAR2 in attr_check to
> grab values for VAR1 and VAR2 (even though VAR1 is mentioned twice
> in the expression), and use them in the evaluation you will perform.
>
>> So for the matching we would need to get the order right, i.e.
>>
>>     const char *inspect_name = git_attr_name(item.attr_match[i].attr);
>>     for (j=0; j <  p.attr_check.check_nr; j++) {
>>         const char *cur_name = git_attr_name(p.attr_check.check[j].attr);
>>         if (!strcmp(inspect_name, cur_name))
>>             break;
>
> You do not strcmp() when you have attributes.  They are interned so
> that you can compare their addresses.  That makes it somewhat
> cheaper.
>
> Once you start "expression over random attributes", you'd need to
> map attr => value somehow.  The format attr_check structure gives
> you, i.e. a list of <attr, value>, is aimed at compactness than
> random hashmap-like access.  If the caller wants a hashmap-like
> access for performance purposes, the caller does that itself.

To expand this a bit, I actually do not think hashmap-like access is
necessary even in such an application.

An implementation of the evaluator, at least a production-quality
one, for the attr expression example shown above is unlikely to keep
the expression in a single string "(VAR1=VAL1 | VAR1=VAL2) & VAR2".
Instead it would use a parse tree with nodes that represent
operators (e.g. OR, AND, "=", etc.)  and terms (e.g. attribute
reference VAR1, VAR2, and constatnts "VAL1", "VAL2") as its internal
representation.

And a node in such a parse tree that refers to an attribute VAR1
would unlikely to keep a "const char *" that is "VAR1".  The node
would have a field that stores an index into the attr_check.check[]
array when the textual expression is "compiled" into a parse tree
and the set of attributes the expression uses (hence it needs to
ask the attributes API about, to prepare attr_check array) is
identified.

When "evaluating" the parse-tree, a node that refers to an attribute
has an index into attr_check.check[] array, so there is no need for
a loop like the one shown above at all.

When "showing" the expression (for debugging purposes), it would
grab the "index into the attr_check.check[] array" from the node,
and the element in that array is an <attr, value> pair, so it can
use git_attr_name() to obtain the attribute name.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html