Re: [PATCH v3 00/15] ref-filter: use parsing functions

Karthik Nayak <karthik.188@xxxxxxxxx> · Sat, 9 Jan 2016 14:30:22 +0530

On Fri, Jan 8, 2016 at 2:58 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Karthik Nayak <karthik.188@xxxxxxxxx> writes:
>
>> So we something like this for the parsing function:
>>
>>  int parse_ref_filter_atom(const char *atom, const char *ep)
>>  {
>>         const char *sp;
>> +       char *arg;
>
> I think this and the new parameter to .parser() function should be
> "const char *".
>

Yes. Will do that.

>> @@ -141,6 +143,7 @@ int parse_ref_filter_atom(const char *atom, const char *ep)
>>                 const char *formatp = strchr(sp, ':');
>>                 if (!formatp || ep < formatp)
>>                         formatp = ep;
>> +               arg = (char *)formatp;
>>                 if (len == formatp - sp && !memcmp(valid_atom[i].name, sp, len))
>>                         break;
>
> And this part can just use arg without formatp.  The original is a
> bit sloppy and will keep looking for a colon past ep, but we already
> know between sp and ep there is no NUL, so we could do this:
>
>                 arg = memchr(sp, ':', ep - sp);
>                 if ((!arg || len == arg - sp) &&
>                     !memcmp(valid_atom[i].name, sp, len))
>                         break;
>

This even removes the need for formatp. Thanks

>> @@ -154,6 +157,13 @@ int parse_ref_filter_atom(const char *atom, const char *ep)
>>         REALLOC_ARRAY(used_atom, used_atom_cnt);
>>         used_atom[at].name = xmemdupz(atom, ep - atom);
>>         used_atom[at].type = valid_atom[i].cmp_type;
>> +       if (arg != ep)
>> +               arg = xstrndup(arg + 1, ep - arg - 1);
>> +       else
>> +               arg = NULL;
>
> Why even copy?  The original that used match_atom_name() borrowed
> part of existing string via (const char **val), so you know whatever
> used that &buf you grabbed out of match_atom_name() should only be
> reading the values not writing into the memory, no?
>

I totally missed that we could use the existing used_atom[at].name
to get the argument. Thanks

> That is why I think arg should be "const char *".
>
> As the above memchr() alrady took care of "we didn't find a colon"
> case, we only need to do this here, I think:
>
>         if (arg)
>                 arg = used_atom[at].name + (arg - atom);
>
> and without later free().
>
          if (arg)
                  arg = used_atom[at].name + (arg - atom) + 1;

'+ 1' to skip over the colon perhaps?

> Alternatively, we could add an int field to elements of used_atom[]
> array that says what byte-offset in the used_atom[].name the atom
> arguments start (if any).  Then .parser() does not have to take the
> parameter separately [*1*].
>
> [Footnote]
>
> *1* Thinking about it more, perhaps used_atom[].type should be
> removed and instead used_atom[].atom should be a pointer into the
> valid_atom[] array.  Then any reference to used_atom[].type will
> become used_atom[].atom->cmp_type, which is much nicer for two
> reasons: (1) one less useless copy (2) one less field that has a
> name "type" that is overly generic.
>
> That does not remove the need for recording where the atom argument
> is, though, in used_atom[].  We could add a bit "has_deref" to
> used_atom[] and then do something like this:
>
>     arg = used_atom[i].name + used_atom[i].atom->namelen +
>           used_atom[i].has_deref;
>
> but I do not think we want to go there.  It would hardcode the
> knowledge that used_atom[i].name is either used_atom[i].atom->name
> or one asterisk prefixed to it, making future extension of the
> syntax even harder.

Makes sense, to sum it up, we'll provide arguments to the parser function:

        void (*parser)(struct used_atom *atom, const char *arg);

So that we provide the arguments to the functions and ensure no need of parsing
atom->name.

Also ensure that we have separate parser functions for 'body',
'subject' and 'contents'.

-- 
Regards,
Karthik Nayak
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html