On Wed, Dec 13, 2023 at 08:07:21AM +0100, Patrick Steinhardt wrote: > > static const char *unquote_quoted_string(struct strbuf *outbuf, const char *in) > > { > > - int c; > > int take_next_literally = 0; > > > > - while ((c = *in++) != 0) { > > + while (*in) { > > + int c = *in++; > > if (take_next_literally == 1) { > > take_next_literally = 0; > > } else { > > I was wondering whether we want to convert `unquote_quoted_pair()` in > the same way. It's not subject to the same issue because it doesn't > return `in` to its callers. But it would lower the cognitive overhead a > bit by keeping the coding style consistent. But please feel free to > ignore this remark. I dunno. I don't think the consistency matters that much between functions, and on its own, the "while ((c = *in++))" pattern is not harmful. It's used elsewhere in the same file for other functions that are also fine (for decoding q/b headers). > Another thing I was wondering about is the recursive nature of these > functions, and whether it can lead to similar issues like we recently > had when parsing revisions with stack exhaustion. I _think_ this should > not be much of a problem in this case though, as we're talking about > mail headers here. While the length of header values isn't restricted > per se (the line length is restricted to 1000, but with Comment Folding > Whitespace values can span multiple lines), but mail provdiers will sure > clamp down on mails with a "From:" header that is megabytes in size. It's just unquote_comment() that is recursive, but yeah, there is nothing to stop it from recursing forever on "((((((...". The stack requirements are pretty small, though. I needed between 2^17 and 2^18 bytes to segfault on my machine using: perl -e 'print "From: ", "(" x 2**18;' | git mailinfo /dev/null /dev/null Absurdly big for an email, but maybe within the realm of possibility? I think it might be possible to drop the recursion and just use a depth counter, like this: diff --git a/mailinfo.c b/mailinfo.c index 737b9e5e13..db236f9f9f 100644 --- a/mailinfo.c +++ b/mailinfo.c @@ -59,6 +59,7 @@ static void parse_bogus_from(struct mailinfo *mi, const struct strbuf *line) static const char *unquote_comment(struct strbuf *outbuf, const char *in) { int take_next_literally = 0; + int depth = 1; strbuf_addch(outbuf, '('); @@ -72,11 +73,14 @@ static const char *unquote_comment(struct strbuf *outbuf, const char *in) take_next_literally = 1; continue; case '(': - in = unquote_comment(outbuf, in); + strbuf_addch(outbuf, '('); + depth++; continue; case ')': strbuf_addch(outbuf, ')'); - return in; + if (!--depth) + return in; + continue; } } I doubt it's a big deal either way, but if it's that easy to do it might be worth it. -Peff