Eric Sunshine <sunshine@xxxxxxxxxxxxxx> wrote: > On Tue, May 31, 2016 at 3:45 AM, Eric Wong <e@xxxxxxxxx> wrote: > > Eric Sunshine <sunshine@xxxxxxxxxxxxxx> wrote: <snip> Ah thanks, I can see your point-of-view, now. I always had the '^' in my mind since I've written the same thing in Perl and Ruby. > >> I wonder if hand-coding, rather than using a regex, could be an improvement: > >> > >> static int is_mboxrd_from(const char *s, size_t n) > >> { > >> size_t f = strlen("From "); > >> const char *t = s + n; > >> > >> while (s < t && *s == '>') > >> s++; > >> return t - s >= f && !memcmp(s, "From ", f); > >> } > >> > >> or something. > > > > Yikes. I mostly work in high-level languages and do my best to > > avoid string parsing in C; so that scares me. A lot. > > The hand-coded is_mboxrd_from() above is pretty much idiomatic C and > (I think) typical of how such a function would be coded in Git itself, > so it looks normal and easy to grok to me (but, of course, I'm > probably biased since I wrote it). > > > I admit a regex isn't necessary, but I'm wondering if the above > > could be made less frightening to someone like me. > > Perhaps, but it's difficult to say without knowing how it frightens you. Pointer arithmetic leading to buffer overruns; but yeah, I don't do string parsing in C often, if ever. > > Maybe extra test cases + valgrind can quell my fears :) > > I can envision tests such as: > > "" > "F" > "From" > "From " > "From " > "From foobar" > > and so on, if that's what you mean. Yes, I also noticed trailing spaces are dropped anyways, so there's no perfect round-tripping going on. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html