[cc:+junio]

On Thu, Jun 2, 2016 at 3:51 AM, Eric Wong <e@xxxxxxxxx> wrote:
> Eric Wong <e@xxxxxxxxx> wrote:
>> Eric Sunshine <sunshine@xxxxxxxxxxxxxx> wrote:
>> > On Tue, May 31, 2016 at 3:45 AM, Eric Wong <e@xxxxxxxxx> wrote:
>> > > Eric Sunshine <sunshine@xxxxxxxxxxxxxx> wrote:
>> > >> I wonder if hand-coding, rather than using a regex, could be an improvement:
>> > >>
>> > >>     static int is_mboxrd_from(const char *s, size_t n)
>> > >>     {
>> > >>         size_t f = strlen("From ");
>> > >>         const char *t = s + n;
>> > >>
>> > >>         while (s < t && *s == '>')
>> > >>             s++;
>> > >>         return t - s >= f && !memcmp(s, "From ", f);
>> > >>     }
>> > >>
>> > >> or something.
>> > >
>> > > Yikes.  I mostly work in high-level languages and do my best to
>> > > avoid string parsing in C; so that scares me.  A lot.
>> >
>> > The hand-coded is_mboxrd_from() above is pretty much idiomatic C and
>> > (I think) typical of how such a function would be coded in Git itself,
>> > so it looks normal and easy to grok to me (but, of course, I'm
>> > probably biased since I wrote it).
>
> For reference, here is the gfrom function from qmail (gfrom.c,
> source package netqmail-1.06 in Debian, reformatted git style)
>
>     int gfrom(char *s, int len)
>     {
>         while ((len > 0) && (*s == '>')) {
>             ++s;
>             --len;
>         }
>
>         return (len >= 5) && !str_diffn(s, "From ", 5);
>     }

Seems less idiomatic and less like what we might see elsewhere in the
Git codebase, but that's subjective. Functionally, it appears correct.

> Similar to yours, but a several small things improves
> readability for me:
>
> * the avoidance of subtraction from the "return" conditional
> * s/n/len/ variable name

Idiomatic C code favors concise names such as 'i', 'j', or 'n', but
I don't care strongly.

> * extra parentheses

Unnecessary syntactic noise (consuming reviewer brain cycles).

> * removal of "t" variable (t for "terminal/termination"?)

Heh, no, just the next letter after 's'. Again, just an idiom: as
'i', 'j', 'k' are often used for integers, 's' and 't' are common for
strings.

> str_diffn is memcmp-like, I assume.  My eyes glazed over
> when I saw that function implemented in str_diffn.c, too.
>
> Just thinking out loud, with sufficient tests I could go with
> either.  Will reroll when/if I get the chance tomorrow.

As mentioned above, it's all subjective and, of course, I have a bias
toward the example I provided, but don't otherwise feel strongly about
it. I do, however, like the idea of using a simple hand-coded matching
function over the regex (but not so much that I would complain about
it). Use whatever you and Junio feel is appropriate.
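
For anyone who wants to try the hand-coded approach in isolation, here
is a minimal, self-contained sketch of the is_mboxrd_from() idea quoted
above. It is only an illustration of the technique discussed in this
thread, not necessarily what will end up in the series. Two details are
additions for this sketch and were not in the original snippet: the
assert() precondition, and the cast to size_t that avoids a
signed/unsigned comparison warning (safe because s never passes t).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the n-byte buffer at s consists of zero or more '>'
 * characters followed by "From " (anything may follow that), else 0.
 * The buffer does not need to be NUL-terminated.
 */
static int is_mboxrd_from(const char *s, size_t n)
{
	const size_t f = strlen("From ");
	const char *t = s + n;	/* one past the last byte of the buffer */

	assert(s != NULL);	/* precondition added for this sketch */

	/* skip the run of leading '>' quoting characters, if any */
	while (s < t && *s == '>')
		s++;

	/* s <= t holds here, so the difference is non-negative */
	return (size_t)(t - s) >= f && !memcmp(s, "From ", f);
}
```

Passing the length explicitly means the function works on a line
sliced out of a larger buffer. For example,
is_mboxrd_from(">>From x", 8) is true, while
is_mboxrd_from("From", 4) is false because the trailing space is
missing.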