Re: [PATCH v4 05/12] ref-filter: introduce parsing functions for each valid atom

Eric Sunshine <sunshine@xxxxxxxxxxxxxx> · Sun, 7 Feb 2016 02:03:19 -0500

On Sat, Feb 6, 2016 at 9:36 AM, Karthik Nayak <karthik.188@xxxxxxxxx> wrote:
> On Thu, Feb 4, 2016 at 3:49 AM, Eric Sunshine <sunshine@xxxxxxxxxxxxxx> wrote:
>> On Sun, Jan 31, 2016 at 12:42 PM, Karthik Nayak <karthik.188@xxxxxxxxx> wrote:
>>> -               const char *formatp = strchr(sp, ':');
>>> -               if (!formatp || ep < formatp)
>>> -                       formatp = ep;
>>> -               if (len == formatp - sp && !memcmp(valid_atom[i].name, sp, len))
>>> +               arg = memchr(sp, ':', ep - sp);
>>
>> Why this change from strchr() to memchr()? I understand that you're
>> taking advantage of the fact that you know the extent of the string
>> via 'sp' and 'ep', however, was the original strchr() doing extra
>> work? Even if this change is desirable, it seems somewhat unrelated to
>> the overall purpose of this patch, thus might deserves its own.
>>
>> Aside from that, although the "expensive" strchr() / memchr() resides
>> within the loop, it will always return the same value since it doesn't
>> depend upon any condition local to the loop. This implies that it
>> ought to be hoisted out of the loop. (This problem is not new to this
>> patch; it's already present in the existing code.)
>
> I'm thinking I'll make a patch for that separately. i.e remove strchr()
> and introduce memchr() outside the loop.

I'd almost suggest making it two patches: (1) change strchr() to
memchr(), and (2) hoist it outside the loop. However, it would be nice
to see this series land with v5, and adding more refactoring patches
could delay its landing if problems are found with those new patches.
Consequently, it might make sense to forego any additional refactoring
for now and just keep the patch as is, except for fixing the
relatively minor issues (and bug or two) raised in the v4 review. If
you take that approach, then hoisting memchr() out of the loop can be
done as a follow-up patch after this series lands.

>>> +               if ((!arg || len == arg - sp) &&
>>> +                   !memcmp(valid_atom[i].name, sp, len))
>>>                         break;
>>>         }
>>>
>>> @@ -154,6 +155,10 @@ int parse_ref_filter_atom(const char *atom, const char *ep)
>>>         REALLOC_ARRAY(used_atom, used_atom_cnt);
>>>         used_atom[at].name = xmemdupz(atom, ep - atom);
>>>         used_atom[at].type = valid_atom[i].cmp_type;
>>> +       if (arg)
>>> +               arg = used_atom[at].name + (arg - atom) + 1;
>>
>> This is a harder to understand than it ought to be because it's
>> difficult to tell at first glance that you don't actually care about
>> 'arg' in relation to the original incoming string, but instead use it
>> only to compute an offset into the string which is ultimately stored
>> in the newly allocated used_atom[]. Re-using 'arg' for a different
>> purpose (in a manner of speaking) confuses the issue further.
>>
>> The intention might be easier to follow if you made it clear that you
>> were interested in the *offset* of the argument in the string, rather
>> than a pointer into the incoming string which you ultimately don't
>> use. A variable named 'arg_offset' might go a long way toward
>> clarifying this intention.
>
> I hope you mean something like this.
>
> if (arg) {
>     int arg_offset;
>
>     arg_offset = (arg - atom) + 1;
>     arg = used_atom[at].name + arg_offset;
> }

That's one way, but I was actually thinking about computing arg_offset
earlier in the "is it a valid atom?" loop, which would make it clear
that you care only about the offset at that point, rather than the
pointer to the ':' in the original string (since that pointer is never
itself used other than to compute the offset). However, having tried
it myself, the code ends up being nosier, thus not necessarily a win,
so maybe just leave this as is for now.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html