Re: [PATCH v3 00/15] ref-filter: use parsing functions

Junio C Hamano <gitster@xxxxxxxxx> · Thu, 07 Jan 2016 13:28:35 -0800

Karthik Nayak <karthik.188@xxxxxxxxx> writes:

> So we something like this for the parsing function:
>
>  int parse_ref_filter_atom(const char *atom, const char *ep)
>  {
>         const char *sp;
> +       char *arg;

I think this and the new parameter to .parser() function should be
"const char *".

> @@ -141,6 +143,7 @@ int parse_ref_filter_atom(const char *atom, const char *ep)
>                 const char *formatp = strchr(sp, ':');
>                 if (!formatp || ep < formatp)
>                         formatp = ep;
> +               arg = (char *)formatp;
>                 if (len == formatp - sp && !memcmp(valid_atom[i].name, sp, len))
>                         break;

And this part can just use arg without formatp.  The original is a
bit sloppy and will keep looking for a colon past ep, but we already
know between sp and ep there is no NUL, so we could do this:

		arg = memchr(sp, ':', ep - sp);
		if ((!arg || len == arg - sp) &&
		    !memcmp(valid_atom[i].name, sp, len))
			break;

> @@ -154,6 +157,13 @@ int parse_ref_filter_atom(const char *atom, const char *ep)
>         REALLOC_ARRAY(used_atom, used_atom_cnt);
>         used_atom[at].name = xmemdupz(atom, ep - atom);
>         used_atom[at].type = valid_atom[i].cmp_type;
> +       if (arg != ep)
> +               arg = xstrndup(arg + 1, ep - arg - 1);
> +       else
> +               arg = NULL;

Why even copy?  The original that used match_atom_name() borrowed
part of existing string via (const char **val), so you know whatever
used that &buf you grabbed out of match_atom_name() should only be
reading the values not writing into the memory, no?

That is why I think arg should be "const char *".

As the above memchr() alrady took care of "we didn't find a colon"
case, we only need to do this here, I think:

	if (arg)
        	arg = used_atom[at].name + (arg - atom);

and without later free().

Alternatively, we could add an int field to elements of used_atom[]
array that says what byte-offset in the used_atom[].name the atom
arguments start (if any).  Then .parser() does not have to take the
parameter separately [*1*].

> +       if (valid_atom[i].parser)
> +               valid_atom[i].parser(&used_atom[at], arg);
> +       free(arg);
>         if (*atom == '*')
>                 need_tagged = 1;
>         if (!strcmp(used_atom[at].name, "symref"))

[Footnote]

*1* Thinking about it more, perhaps used_atom[].type should be
removed and instead used_atom[].atom should be a pointer into the
valid_atom[] array.  Then any reference to used_atom[].type will
become used_atom[].atom->cmp_type, which is much nicer for two
reasons: (1) one less useless copy (2) one less field that has a
name "type" that is overly generic.

That does not remove the need for recording where the atom argument
is, though, in used_atom[].  We could add a bit "has_deref" to
used_atom[] and then do something like this:

    arg = used_atom[i].name + used_atom[i].atom->namelen +
          used_atom[i].has_deref;

but I do not think we want to go there.  It would hardcode the
knowledge that used_atom[i].name is either used_atom[i].atom->name
or one asterisk prefixed to it, making future extension of the
syntax even harder.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html