Re: [PATCH/RFC 00/10] ref-filter: use parsing functions

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Dec 11, 2015 at 5:49 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Karthik Nayak <karthik.188@xxxxxxxxx> writes:
>>   ref-filter: introduce a parsing function for each atom in valid_atom
>>   ref-filter: introduce struct used_atom
>>   ref-fitler: bump match_atom() name to the top
>>   ref-filter: skip deref specifier in match_atom_name()
>>   ref-filter: introduce color_atom_parser()
>>   strbuf: introduce strbuf_split_str_without_term()
>>   ref-filter: introduce align_atom_parser()
>>   ref-filter: introduce remote_ref_atom_parser()
>>   ref-filter: introduce contents_atom_parser()
>>   ref-filter: introduce objectname_atom_parser()
>
> It seems that this series had seen quite a good inputs, mostly from
> Eric.  I finished reading it over and I didn't find anything more to
> add.  The patches are mostly good and would be ready once these
> points raised during the review are addressed, I think

I'm still a bit fuzzy about what this series is trying to achieve. It
feels like it wants to avoid doing repeated processing of unchanging
bits of %(foo:bar) atoms for each ref processed, but it only partly
achieves that goal. For instance, there are still an awful lot of
strcmp()s and starts_with()s in that inner loop, and even the
unchanging %(color:) argument gets re-evaulated repeatedly, which is
probably quite expensive.

If the intention is to rid that inner loop of much of the expensive
processing, then wouldn't we want to introduce an enum of valid atoms
which is to be a member of 'struct used_atom', and have
populate_value() switch on the enum value rather than doing all the
expensive strcmp()s and starts_with()?

    enum atom_type {
        AT_REFNAME,
        AT_OBJECTTYPE,
        ...
        AT_COLOR,
        AT_ALIGN
    };

    static struct used_atom {
        enum atom_type atom;
        cmp_type cmp;
        union {
            char *color; /* parsed color */
            struct align align;
            enum { ... } remote_ref;
            struct {
                enum { ... } portion;
                unsigned int nlines;
            } contents;
            int short_objname;
        } u;
    } *used_atom;

In fact, the 'cmp_type cmp' field can be dropped altogether since it
can just as easily be looked up when needed by keeping 'enum
atom_type' and valid_atoms[] in-sync and indexing into valid_atoms[]
by atom_type:

    struct used_atom *a = ...;
    cmp_type cmp = valid_atoms[a->atom].cmp_type;

As a bonus, an 'enum atom_type' would resolve the problem of how
starts_with() is abused in populate_value() for certain cases
(assuming I'm understanding the logic), such as how matching of
"color" could incorrectly match some yet-to-be-added atom named
"colorize".
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]