Eric Sunshine <sunshine@xxxxxxxxxxxxxx> wrote: > On Mon, May 30, 2016 at 7:21 PM, Eric Wong <e@xxxxxxxxx> wrote: > > diff --git a/pretty.c b/pretty.c > > @@ -1697,12 +1699,34 @@ static void pp_handle_indent(struct pretty_print_context *pp, > > +static regex_t *mboxrd_prepare(void) > > +{ > > + static regex_t preg; > > + const char re[] = "^>*From "; > > + int err = regcomp(&preg, re, REG_NOSUB | REG_EXTENDED); > > +[...] > > + return &preg; > > +} > > + > > void pp_remainder(struct pretty_print_context *pp, > > const char **msg_p, > > struct strbuf *sb, > > int indent) > > { > > + static regex_t *mboxrd_from; > > + > > + if (pp->fmt == CMIT_FMT_MBOXRD && !mboxrd_from) > > + mboxrd_from = mboxrd_prepare(); > > + > > @@ -1725,8 +1749,13 @@ void pp_remainder(struct pretty_print_context *pp, > > else if (pp->expand_tabs_in_log) > > strbuf_add_tabexpand(sb, pp->expand_tabs_in_log, > > line, linelen); > > - else > > + else { > > + if (pp->fmt == CMIT_FMT_MBOXRD && > > + !regexec(mboxrd_from, line, 0, 0, 0)) > > + strbuf_addch(sb, '>'); > > At first glance, this seems dangerous since it's handing 'line' to > regexec() without paying attention to 'linelen'. For an arbitrary > regex, this could result in erroneous matches on subsequent "lines", > however, since this expression is anchored with '^', it's not a > problem. But, it is a bit subtle. Maybe having more context of the pp_remainder function and seeing the get_one_line call would've helped in the diff; I didn't think of this issue once I figured out where to place the change. On the other hand, not being too familiar with git C APIs, I was more worried strbuf was not NUL-terminated for regexec, but it seems to be. > I wonder if hand-coding, rather than using a regex, could be an improvement: > > static int is_mboxrd_from(const char *s, size_t n) > { > size_t f = strlen("From "); > const char *t = s + n; > > while (s < t && *s == '>') > s++; > return t - s >= f && !memcmp(s, "From ", f); > } > > or something. Yikes. I mostly work in high-level languages and do my best to avoid string parsing in C; so that scares me. A lot. I admit a regex isn't necessary, but I'm wondering if the above could be made less frightening to someone like me. Or maybe I'm just easily frightened :x Maybe extra test cases + valgrind can quell my fears :) > > diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh > > @@ -1565,4 +1565,31 @@ test_expect_success 'format-patch --base overrides format.useAutoBase' ' > > +test_expect_success 'format-patch --pretty=mboxrd' ' > > + cat >msg <<-INPUT_END && > > Maybe use <<-\INPUT_END to indicate that no variable interpolation is > expected. Ditto below. Noted, though I may add variable interpolation to test a line with trailing whitespace (just "From ") to ensure we don't overrun a buffer. > Hmm, -A is not POSIX and is otherwise not used in Git tests. Perhaps > you could use 'git grep --no-index -A' instead or something? Noted and will fix for v2. Will wait a day or two for further comments on this series. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html