On Mon, Sep 19, 2016 at 08:54:40PM +0200, Kevin Daudt wrote:

> diff --git a/t/t5100/comment.expect b/t/t5100/comment.expect
> new file mode 100644
> index 0000000..1197e76
> --- /dev/null
> +++ b/t/t5100/comment.expect
> @@ -0,0 +1,5 @@
> +Author: A U Thor (this is a comment (really))

Hmm. I don't see any recursion in your parsing, so after the first ")"
our escape_context would be 0 again, right? So a more tricky test is:

  Author: A U Thor (this is a comment (really) with \(quoted\) pairs)

We are still inside "ctext" when we hit those quoted pairs, and they
should be unquoted, but your code would not do so (unless we go the
route of simply unquoting pairs everywhere).

I think your parser would have to follow the BNF more closely with a
recursive descent parser, like:

	const char *parse_comment(const char *in, struct strbuf *out)
	{
		size_t orig_out = out->len;

		if ((in = parse_char('(', in, out)) &&
		    (in = parse_ccontent(in, out)) &&
		    (in = parse_char(')', in, out)))
			return in;

		/* not a complete comment; roll back whatever we added */
		strbuf_setlen(out, orig_out);
		return NULL;
	}

	const char *parse_ccontent(const char *in, struct strbuf *out)
	{
		while (*in && *in != ')') {
			const char *next;

			if ((next = parse_quoted_pair(in, out)) ||
			    (next = parse_comment(in, out)) ||
			    (next = parse_ctext(in, out))) {
				in = next;
				continue;
			}
			/* nothing matched; stop rather than spin forever */
			break;
		}

		/*
		 * if "in" is NUL here we have an unclosed comment; but we'll
		 * just silently ignore and accept it
		 */
		return in;
	}

	const char *parse_char(char c, const char *in, struct strbuf *out)
	{
		if (*in != c)
			return NULL;
		strbuf_addch(out, c);
		return in + 1;
	}

You can probably guess at the implementation of parse_quoted_pair(),
parse_ctext(), etc (and naturally, the above is completely untested and
probably has some bugs in it).

In a former life (back when it was still rfc822!) I remember
implementing a similar parser, which I think was in turn based on the
cclient code in pine. It's not _too_ hard to get it all right based on
the BNF in the RFC, but as you can see it's a bit tedious.
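
A minimal sketch of the two helpers left as an exercise above, under a
few loudly stated assumptions: the struct strbuf and strbuf_addch() here
are toy fixed-size stand-ins for git's real ones, defined only so the
snippet compiles on its own, and the ctext rule is simplified to "any
byte other than parentheses and backslash" (the RFC grammar is stricter
about control characters and whitespace). It is meant as an
illustration, not as what the patch should contain.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for git's struct strbuf (assumption: fixed size, no growth). */
struct strbuf {
	size_t len;
	char buf[256];
};

/* Toy stand-in for git's strbuf_addch(); keeps buf NUL-terminated. */
void strbuf_addch(struct strbuf *sb, int c)
{
	assert(sb->len + 1 < sizeof(sb->buf));
	sb->buf[sb->len++] = (char)c;
	sb->buf[sb->len] = '\0';
}

/*
 * quoted-pair: a backslash followed by one character. Only the
 * character is emitted, which is the "unquoting". A backslash at the
 * very end of the input is not a quoted pair.
 */
const char *parse_quoted_pair(const char *in, struct strbuf *out)
{
	if (in[0] != '\\' || !in[1])
		return NULL;
	strbuf_addch(out, in[1]);
	return in + 2;
}

/*
 * ctext (simplified): the longest run of bytes that are not "(", ")"
 * or "\". Returns NULL when the run is empty, so a caller can try
 * another alternative.
 */
const char *parse_ctext(const char *in, struct strbuf *out)
{
	size_t n = strcspn(in, "()\\");
	size_t i;

	if (!n)
		return NULL;
	for (i = 0; i < n; i++)
		strbuf_addch(out, in[i]);
	return in + n;
}
```

Both helpers follow the same convention as parse_char(): return the
position after what was consumed, or NULL without touching "out" when
the input does not match, so the alternatives in parse_ccontent() can be
tried in turn.
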
And I'm not convinced we actually need it to be completely right for our
purposes. We really are looking for a single address, with the email in
"<>" and the name as everything before that, but de-quoted.

-Peff