Re: [BUG PATCH RFC] mailinfo: correctly handle multiline 'Subject:' header

On Sat, Jan 10, 2009 at 05:54:14PM -0800, Junio C Hamano wrote:
> Kirill Smelkov <kirr@xxxxxxxxxxxxxxxxxxx> writes:
> 
> >     [but I'm not sure whether testresult with Nathaniel Borenstein
> >      (םולש ןב ילטפנ) is correct -- see rfc2047-info-0004]
> > ...
> > diff --git a/t/t5100/rfc2047-info-0004 b/t/t5100/rfc2047-info-0004
> > new file mode 100644
> > index 0000000..850f831
> > --- /dev/null
> > +++ b/t/t5100/rfc2047-info-0004
> > @@ -0,0 +1,5 @@
> > +Author: Nathaniel Borenstein  
> > +     ([somethig that could be detected as spam])
> > +Email: nsb@xxxxxxxxxxxxxxxxxxxx
> > +Subject: Test of new header generator
> > +
> 
> That does look wrong.  If you can fix this, please do so; otherwise please
> mark the test that deals with this entry with test_expect_failure, until
> somebody else does.

Yes, I think I've dealt with it -- we weren't unfolding the 'From' header,
and we weren't skipping comments in RFC822 headers, so:

From: Kirill Smelkov <kirr@xxxxxxxxxxxxxxxxxxx>
Subject: [PATCH] mailinfo: 'From:' header should be unfold as well

At present we unfold headers (see RFC822 3.1.1. LONG HEADER FIELDS) for
all fields except 'From' (always) and 'Subject' (when keep_subject is set).

Not unfolding 'From' is a bug: RFC822 3.1.1 allows any header field to be
folded, so a 'From' that spans several lines has to be unfolded too.

Signed-off-by: Kirill Smelkov <kirr@xxxxxxxxxxxxxxxxxxx>

---
 builtin-mailinfo.c  |    1 +
 t/t5100/sample.mbox |    5 ++++-
 2 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
index f7c8c08..6d72c1b 100644
--- a/builtin-mailinfo.c
+++ b/builtin-mailinfo.c
@@ -860,6 +860,7 @@ static void handle_info(void)
 			}
 			output_header_lines(fout, "Subject", hdr);
 		} else if (!memcmp(header[i], "From", 4)) {
+			cleanup_space(hdr);
 			handle_from(hdr);
 			fprintf(fout, "Author: %s\n", name.buf);
 			fprintf(fout, "Email: %s\n", email.buf);
diff --git a/t/t5100/sample.mbox b/t/t5100/sample.mbox
index 4bf7947..d465685 100644
--- a/t/t5100/sample.mbox
+++ b/t/t5100/sample.mbox
@@ -2,7 +2,10 @@
 	
     
 From nobody Mon Sep 17 00:00:00 2001
-From: A U Thor <a.u.thor@xxxxxxxxxxx>
+From: A
+      U
+      Thor
+      <a.u.thor@xxxxxxxxxxx>
 Date: Fri, 9 Jun 2006 00:44:16 -0700
 Subject: [PATCH] a commit.
 
-- 
tg: (1562445..) t/mail-from-unfold (depends on: master)
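
For illustration only (not part of the patch): the unfolding that the
one-line change above relies on can be sketched as a standalone C helper.
The name unfold_header() is hypothetical and this is not git's
cleanup_space(); it only assumes that "unfolding" means collapsing each
run of whitespace, including the newline and indentation of a
continuation line, into a single space.

```c
#include <ctype.h>

/*
 * Unfold a header value in place: every run of whitespace (including the
 * newline plus leading blanks of a continuation line) becomes one space,
 * and leading/trailing whitespace is dropped.
 * Hypothetical helper for illustration; not git's cleanup_space().
 */
static void unfold_header(char *s)
{
	char *dst = s;
	int pending_space = 0;

	for (const char *src = s; *src; src++) {
		if (isspace((unsigned char)*src)) {
			/* Remember the gap, but emit it only before more text. */
			pending_space = (dst != s);
			continue;
		}
		if (pending_space) {
			*dst++ = ' ';
			pending_space = 0;
		}
		*dst++ = *src;
	}
	*dst = '\0';
}
```

Applied to the folded 'From' value from the sample.mbox hunk above, such a
helper would yield the single-line form that handle_from() expects.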




From: Kirill Smelkov <kirr@xxxxxxxxxxxxxxxxxxx>
Subject: [PATCH] mailinfo: more smarter removal of rfc822 comments from 'From'

As described in RFC822 (3.4.3. COMMENTS and A.1.4.), comments such as those in

    John (zzz) Doe <john.doe@xz> (Comment)

should "NOT [be] included in the destination mailbox".

We need this functionality to pass all RFC2047-based tests in the next commit.

Signed-off-by: Kirill Smelkov <kirr@xxxxxxxxxxxxxxxxxxx>

---
 builtin-mailinfo.c  |   30 ++++++++++++++++++++++++++++++
 t/t5100/sample.mbox |    4 ++--
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c
index 6d72c1b..c0b1ab4 100644
--- a/builtin-mailinfo.c
+++ b/builtin-mailinfo.c
@@ -29,6 +29,9 @@ static struct strbuf **p_hdr_data, **s_hdr_data;
 #define MAX_HDR_PARSED 10
 #define MAX_BOUNDARIES 5
 
+static void cleanup_space(struct strbuf *sb);
+
+
 static void get_sane_name(struct strbuf *out, struct strbuf *name, struct strbuf *email)
 {
 	struct strbuf *src = name;
@@ -120,6 +123,33 @@ static void handle_from(const struct strbuf *from)
 		strbuf_setlen(&f, f.len - 1);
 	}
 
+	/* We may still not be done for a From line like
+	 *
+	 *	"John (zzz) Doe <john.doe@xz> (Comment)"
+	 *
+	 * The email part has already been removed, so let's kill comments as
+	 * well -- RFC822 says comments should not be included in the
+	 * destination mailbox (3.4.3. COMMENTS and A.1.4.)
+	 */
+	while (1) {
+		char *ta;
+
+		at = strchr(f.buf, '(');
+		if (!at)
+			break;
+		ta = strchr(at, ')');
+		if (!ta)
+			break;
+
+		strbuf_remove(&f, at - f.buf, ta - at + 1);
+	}
+
+	/* and finally clean up the spaces left around (possibly
+	 * internal) comments
+	 */
+	cleanup_space(&f);
+	strbuf_trim(&f);
+
 	get_sane_name(&name, &f, &email);
 	strbuf_release(&f);
 }
diff --git a/t/t5100/sample.mbox b/t/t5100/sample.mbox
index d465685..42e02f3 100644
--- a/t/t5100/sample.mbox
+++ b/t/t5100/sample.mbox
@@ -2,10 +2,10 @@
 	
     
 From nobody Mon Sep 17 00:00:00 2001
-From: A
+From: A (zzz)
       U
       Thor
-      <a.u.thor@xxxxxxxxxxx>
+      <a.u.thor@xxxxxxxxxxx> (Comment)
 Date: Fri, 9 Jun 2006 00:44:16 -0700
 Subject: [PATCH] a commit.
 
-- 
tg: (b798ad9..) t/mail-from-comments (depends on: t/mail-from-unfold)
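
For illustration only (not part of the patch): the loop above pairs each
'(' with the first ')' that follows it, so a nested comment such as
"(a (b) c)" would not be removed as one unit. A depth-counting variant
could look like the sketch below. strip_comments() is a hypothetical
standalone helper, not code from builtin-mailinfo.c, and it deliberately
ignores quoted strings and backslash escapes.

```c
/*
 * Remove RFC 822 style comments from a display name in place, tracking
 * nesting depth so that "(a (b) c)" is dropped as one comment.
 * Assumptions (hypothetical helper, not git code):
 *  - quoted strings and backslash escapes are NOT handled;
 *  - a ')' outside any comment is kept verbatim;
 *  - if a '(' is never closed, the rest of the string is dropped,
 *    which differs from the patch (it leaves such text alone).
 */
static void strip_comments(char *s)
{
	char *dst = s;
	int depth = 0;

	for (const char *src = s; *src; src++) {
		if (*src == '(') {
			depth++;
		} else if (*src == ')' && depth > 0) {
			depth--;
		} else if (depth == 0) {
			/* Outside every comment: keep the character. */
			*dst++ = *src;
		}
	}
	*dst = '\0';
}
```

Like the loop in the patch, this leaves behind the spaces that surrounded
each comment, so a whitespace cleanup and trim pass is still needed
afterwards.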



All these patches + the original one (trivially adapted) can be pulled from

    git://repo.or.cz/git/kirr.git  for-junio



Kirill Smelkov (3):
      mailinfo: 'From:' header should be unfold as well
      mailinfo: more smarter removal of rfc822 comments from 'From'
      mailinfo: correctly handle multiline 'Subject:' header


builtin-mailinfo.c           |   58 ++++++++++++++++++++++++++++++++++++------
t/t5100-mailinfo.sh          |   24 ++++++++++++++++-
t/t5100/info0012             |    5 +++
t/t5100/msg0012              |    7 +++++
t/t5100/patch0012            |   30 +++++++++++++++++++++
t/t5100/rfc2047-info-0001    |    4 +++
t/t5100/rfc2047-info-0002    |    4 +++
t/t5100/rfc2047-info-0003    |    4 +++
t/t5100/rfc2047-info-0004    |    4 +++
t/t5100/rfc2047-info-0005    |    2 +
t/t5100/rfc2047-info-0006    |    2 +
t/t5100/rfc2047-info-0007    |    2 +
t/t5100/rfc2047-info-0008    |    2 +
t/t5100/rfc2047-info-0009    |    2 +
t/t5100/rfc2047-info-0010    |    2 +
t/t5100/rfc2047-info-0011    |    2 +
t/t5100/rfc2047-samples.mbox |   48 ++++++++++++++++++++++++++++++++++
t/t5100/sample.mbox          |   57 ++++++++++++++++++++++++++++++++++++++++-
18 files changed, 249 insertions(+), 10 deletions(-)


Thanks,
Kirill
