On Sun, Dec 13, 2015 at 11:10 AM, Eric Sunshine <sunshine@xxxxxxxxxxxxxx> wrote: > On Fri, Dec 11, 2015 at 5:49 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote: >> Karthik Nayak <karthik.188@xxxxxxxxx> writes: >>> ref-filter: introduce a parsing function for each atom in valid_atom >>> ref-filter: introduce struct used_atom >>> ref-fitler: bump match_atom() name to the top >>> ref-filter: skip deref specifier in match_atom_name() >>> ref-filter: introduce color_atom_parser() >>> strbuf: introduce strbuf_split_str_without_term() >>> ref-filter: introduce align_atom_parser() >>> ref-filter: introduce remote_ref_atom_parser() >>> ref-filter: introduce contents_atom_parser() >>> ref-filter: introduce objectname_atom_parser() >> >> It seems that this series had seen quite a good inputs, mostly from >> Eric. I finished reading it over and I didn't find anything more to >> add. The patches are mostly good and would be ready once these >> points raised during the review are addressed, I think > > I'm still a bit fuzzy about what this series is trying to achieve. It > feels like it wants to avoid doing repeated processing of unchanging > bits of %(foo:bar) atoms for each ref processed, but it only partly > achieves that goal. For instance, there are still an awful lot of > strcmp()s and starts_with()s in that inner loop, and even the > unchanging %(color:) argument gets re-evaulated repeatedly, which is > probably quite expensive. > Yes, you're right. It started off as per Junio's suggestion which you can find here: http://thread.gmane.org/gmane.comp.version-control.git/279254 > If the intention is to rid that inner loop of much of the expensive > processing, then wouldn't we want to introduce an enum of valid atoms > which is to be a member of 'struct used_atom', and have > populate_value() switch on the enum value rather than doing all the > expensive strcmp()s and starts_with()? > > enum atom_type { > AT_REFNAME, > AT_OBJECTTYPE, > ... > AT_COLOR, > AT_ALIGN > }; > > static struct used_atom { > enum atom_type atom; > cmp_type cmp; > union { > char *color; /* parsed color */ > struct align align; > enum { ... } remote_ref; > struct { > enum { ... } portion; > unsigned int nlines; > } contents; > int short_objname; > } u; > } *used_atom; > > In fact, the 'cmp_type cmp' field can be dropped altogether since it > can just as easily be looked up when needed by keeping 'enum > atom_type' and valid_atoms[] in-sync and indexing into valid_atoms[] > by atom_type: > > struct used_atom *a = ...; > cmp_type cmp = valid_atoms[a->atom].cmp_type; > > As a bonus, an 'enum atom_type' would resolve the problem of how > starts_with() is abused in populate_value() for certain cases > (assuming I'm understanding the logic), such as how matching of > "color" could incorrectly match some yet-to-be-added atom named > "colorize". This actually makes sense to me, especially how we can eliminate most of the starts_with() in populate_value(). Well then I guess I'll work on this and see what I can come up with, Thanks for the review :) -- Regards, Karthik Nayak -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html