On Sat, Jan 10, 2009 at 05:54:14PM -0800, Junio C Hamano wrote:
> Kirill Smelkov <kirr@xxxxxxxxxxxxxxxxxxx> writes:
>
> > [but I'm not sure whether testresult with Nathaniel Borenstein
> > (םולש ןב ילטפנ) is correct -- see rfc2047-info-0004]
> > ...
> > diff --git a/t/t5100/rfc2047-info-0004 b/t/t5100/rfc2047-info-0004
> > new file mode 100644
> > index 0000000..850f831
> > --- /dev/null
> > +++ b/t/t5100/rfc2047-info-0004
> > @@ -0,0 +1,5 @@
> > +Author: Nathaniel Borenstein
> > + ([somethig that could be detected as spam])
> > +Email: nsb@xxxxxxxxxxxxxxxxxxxx
> > +Subject: Test of new header generator
> > +
>
> That does look wrong. If you can fix this, please do so; otherwise please
> mark the test that deals with this entry with test_expect_failure, until
> somebody else does.

Yes, I think I've dealt with it -- we weren't unfolding the 'From' header,
and we were not skipping comments in rfc822 headers, so:


From: Kirill Smelkov <kirr@xxxxxxxxxxxxxxxxxxx>
Subject: [PATCH] mailinfo: 'From:' header should be unfold as well

At present we do headers unfolding (see RFC822 3.1.1. LONG HEADER FIELDS) for
all fields except 'From' (always) and 'Subject' (when keep_subject is set)

Not unfolding 'From' is a bug -- see above-mentioned RFC link.

Signed-off-by: Kirill Smelkov <kirr@xxxxxxxxxxxxxxxxxxx>
---
 builtin-mailinfo.c  |    1 +
 t/t5100/sample.mbox |    5 ++++-
 2 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
index f7c8c08..6d72c1b 100644
--- a/builtin-mailinfo.c
+++ b/builtin-mailinfo.c
@@ -860,6 +860,7 @@ static void handle_info(void)
 			}
 			output_header_lines(fout, "Subject", hdr);
 		} else if (!memcmp(header[i], "From", 4)) {
+			cleanup_space(hdr);
 			handle_from(hdr);
 			fprintf(fout, "Author: %s\n", name.buf);
 			fprintf(fout, "Email: %s\n", email.buf);
diff --git a/t/t5100/sample.mbox b/t/t5100/sample.mbox
index 4bf7947..d465685 100644
--- a/t/t5100/sample.mbox
+++ b/t/t5100/sample.mbox
@@ -2,7 +2,10 @@
 From nobody Mon Sep 17 00:00:00 2001
-From: A U Thor <a.u.thor@xxxxxxxxxxx>
+From: A
+ U
+ Thor
+ <a.u.thor@xxxxxxxxxxx>
 Date: Fri, 9 Jun 2006 00:44:16 -0700
 Subject: [PATCH] a commit.
-- 
tg: (1562445..) t/mail-from-unfold (depends on: master)


From: Kirill Smelkov <kirr@xxxxxxxxxxxxxxxxxxx>
Subject: [PATCH] mailinfo: more smarter removal of rfc822 comments from 'From'

As described in RFC822 (3.4.3 COMMENTS, and A.1.4.), comments, as e.g.

    John (zzz) Doe <john.doe@xz> (Comment)

should "NOT [be] included in the destination mailbox"

We need this functionality to pass all RFC2047 based tests in the next commit.

Signed-off-by: Kirill Smelkov <kirr@xxxxxxxxxxxxxxxxxxx>
---
 builtin-mailinfo.c  |   30 ++++++++++++++++++++++++++++++
 t/t5100/sample.mbox |    4 ++--
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
index 6d72c1b..c0b1ab4 100644
--- a/builtin-mailinfo.c
+++ b/builtin-mailinfo.c
@@ -29,6 +29,9 @@ static struct strbuf **p_hdr_data, **s_hdr_data;
 #define MAX_HDR_PARSED 10
 #define MAX_BOUNDARIES 5
 
+static void cleanup_space(struct strbuf *sb);
+
+
 static void get_sane_name(struct strbuf *out, struct strbuf *name, struct strbuf *email)
 {
 	struct strbuf *src = name;
@@ -120,6 +123,33 @@ static void handle_from(const struct strbuf *from)
 		strbuf_setlen(&f, f.len - 1);
 	}
 
+	/* This still could not be finished for emails like
+	 *
+	 *	"John (zzz) Doe <john.doe@xz> (Comment)"
+	 *
+	 * The email part had already been removed, so let's kill comments as
+	 * well -- RFC822 says comments should not be present in destination
+	 * mailbox (3.4.3. Comments and A.1.4.)
+	 */
+	while (1) {
+		char *ta;
+
+		at = strchr(f.buf, '(');
+		if (!at)
+			break;
+		ta = strchr(at, ')');
+		if (!ta)
+			break;
+
+		strbuf_remove(&f, at - f.buf, ta-at + (*ta ? 1 : 0));
+	}
+
+	/* and let's finally cleanup spaces that were around (possibly
+	 * internal) comments
+	 */
+	cleanup_space(&f);
+	strbuf_trim(&f);
+
 	get_sane_name(&name, &f, &email);
 	strbuf_release(&f);
 }
diff --git a/t/t5100/sample.mbox b/t/t5100/sample.mbox
index d465685..42e02f3 100644
--- a/t/t5100/sample.mbox
+++ b/t/t5100/sample.mbox
@@ -2,10 +2,10 @@
 From nobody Mon Sep 17 00:00:00 2001
-From: A
+From: A (zzz)
  U
  Thor
- <a.u.thor@xxxxxxxxxxx>
+ <a.u.thor@xxxxxxxxxxx> (Comment)
 Date: Fri, 9 Jun 2006 00:44:16 -0700
 Subject: [PATCH] a commit.
-- 
tg: (b798ad9..) t/mail-from-comments (depends on: t/mail-from-unfold)


All these patches + original one (trivially adapted) could be pulled from

    git://repo.or.cz/git/kirr.git for-junio


Kirill Smelkov (3):
      mailinfo: 'From:' header should be unfold as well
      mailinfo: more smarter removal of rfc822 comments from 'From'
      mailinfo: correctly handle multiline 'Subject:' header

 builtin-mailinfo.c           |   58 ++++++++++++++++++++++++++++++++++++------
 t/t5100-mailinfo.sh          |   24 ++++++++++++++++-
 t/t5100/info0012             |    5 +++
 t/t5100/msg0012              |    7 +++++
 t/t5100/patch0012            |   30 +++++++++++++++++++++
 t/t5100/rfc2047-info-0001    |    4 +++
 t/t5100/rfc2047-info-0002    |    4 +++
 t/t5100/rfc2047-info-0003    |    4 +++
 t/t5100/rfc2047-info-0004    |    4 +++
 t/t5100/rfc2047-info-0005    |    2 +
 t/t5100/rfc2047-info-0006    |    2 +
 t/t5100/rfc2047-info-0007    |    2 +
 t/t5100/rfc2047-info-0008    |    2 +
 t/t5100/rfc2047-info-0009    |    2 +
 t/t5100/rfc2047-info-0010    |    2 +
 t/t5100/rfc2047-info-0011    |    2 +
 t/t5100/rfc2047-samples.mbox |   48 ++++++++++++++++++++++++++++++++++
 t/t5100/sample.mbox          |   57 ++++++++++++++++++++++++++++++++++++++++-
 18 files changed, 249 insertions(+), 10 deletions(-)

Thanks,
Kirill
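
For anyone who wants to try the 'From' cleanup outside of git, here is a
minimal standalone sketch in C of the two steps the patches describe:
dropping RFC822 comments from the name part, then unfolding by collapsing
whitespace. It works on plain NUL-terminated strings rather than git's
strbuf, and the names strip_comments, squeeze_space and clean_from_name are
made up for this illustration; they are not functions in builtin-mailinfo.c.
Like the loop in the patch, it pairs the first '(' with the first ')' after
it, so nested comments are not handled, and it assumes the <email> part has
already been removed.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: delete "(...)" comments in place.
 * Pairs the first '(' with the next ')' and does not handle nesting.
 * Stops at an unbalanced '(' and leaves the rest untouched. */
static void strip_comments(char *buf)
{
	char *open, *close;

	assert(buf != NULL);
	while ((open = strchr(buf, '(')) != NULL) {
		close = strchr(open, ')');
		if (!close)
			break;
		/* shift the tail (including the NUL) over the comment */
		memmove(open, close + 1, strlen(close + 1) + 1);
	}
}

/* Hypothetical helper: collapse each run of whitespace (including the
 * newline + indent of a folded header) into one space, in place, and
 * drop leading and trailing whitespace. */
static void squeeze_space(char *buf)
{
	char *in, *out = buf;
	int pending = 0;	/* whitespace seen since the last kept char */

	assert(buf != NULL);
	for (in = buf; *in; in++) {
		if (isspace((unsigned char)*in)) {
			pending = 1;
			continue;
		}
		if (pending && out != buf)
			*out++ = ' ';
		pending = 0;
		*out++ = *in;
	}
	*out = '\0';
}

/* Comments first, then whitespace, so that the gaps left behind by
 * removed comments are collapsed as well. */
static void clean_from_name(char *buf)
{
	strip_comments(buf);
	squeeze_space(buf);
}
```

Calling clean_from_name() on a writable buffer holding the folded name from
the updated sample.mbox, "A (zzz)\n U\n Thor\n (Comment)", should leave
"A U Thor" in the buffer.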