Re: [RFC/PATCH] mailinfo: do not treat ">From" lines as in-body headers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Junio C Hamano <gitster@xxxxxxxxx> writes:

>> I think you forgot to "git add" mbox.h. That being said, if we did go
>> this route, I do not see any reason to share the code at all. This can
>> be purely a mailinfo.c thing.
>
> OK.  A reroll coming today when I find time.

-- >8 --
From: Jeff King <peff@xxxxxxxx>
Date: Sat, 13 Sep 2014 21:30:38 -0400
Subject: [PATCH] mailinfo: make ">From" in-body header check more robust

Since commit 81c5cf7 (mailinfo: skip bogus UNIX From line inside
body, 2006-05-21), we have treated lines like ">From" in the body as
headers. This makes "git am" work for people who erroneously paste
the whole output from format-patch:

  From 12345abcd...fedcba543210 Mon Sep 17 00:00:00 2001
  From: them
  Subject: [PATCH] whatever

into their email body (assuming that an mbox writer then quotes
"From" as ">From", as otherwise we would actually mailsplit on the
in-body line).

However, this has false positives if somebody actually has a commit
body that starts with "From "; in this case we erroneously remove
the line entirely from the commit message. We can make this check
more robust by making sure the line actually looks like a real mbox
"From" line.

Inspect the line that begins with ">From " a more carefully to only
skip lines that match the expected pattern (note that the datestamp
part of the format-patch output is designed to be kept constant to
help those who write magic(5) entries).

Signed-off-by: Jeff King <peff@xxxxxxxx>
Signed-off-by: Junio C Hamano <gitster@xxxxxxxxx>
---
 builtin/mailinfo.c         | 17 ++++++++++++++++-
 t/t5100-mailinfo.sh        | 18 ++++++++++++++++++
 t/t5100/embed-from.expect  |  5 +++++
 t/t5100/embed-from.in      | 13 +++++++++++++
 t/t5100/quoted-from.expect |  3 +++
 t/t5100/quoted-from.in     | 10 ++++++++++
 6 files changed, 65 insertions(+), 1 deletion(-)
 create mode 100644 t/t5100/embed-from.expect
 create mode 100644 t/t5100/embed-from.in
 create mode 100644 t/t5100/quoted-from.expect
 create mode 100644 t/t5100/quoted-from.in

diff --git a/builtin/mailinfo.c b/builtin/mailinfo.c
index cf11c8d..2632fb0 100644
--- a/builtin/mailinfo.c
+++ b/builtin/mailinfo.c
@@ -288,6 +288,21 @@ static inline int cmp_header(const struct strbuf *line, const char *hdr)
 			line->buf[len] == ':' && isspace(line->buf[len + 1]);
 }
 
+#define SAMPLE "From e6807f3efca28b30decfecb1732a56c7db1137ee Mon Sep 17 00:00:00 2001\n"
+static int is_format_patch_separator(const char *line, int len)
+{
+	const char *cp;
+
+	if (len != strlen(SAMPLE))
+		return 0;
+	if (!skip_prefix(line, "From ", &cp))
+		return 0;
+	if (strspn(cp, "0123456789abcdef") != 40)
+		return 0;
+	cp += 40;
+	return !memcmp(SAMPLE + (cp - line), cp, strlen(SAMPLE) - (cp - line));
+}
+
 static int check_header(const struct strbuf *line,
 				struct strbuf *hdr_data[], int overwrite)
 {
@@ -329,7 +344,7 @@ static int check_header(const struct strbuf *line,
 
 	/* for inbody stuff */
 	if (starts_with(line->buf, ">From") && isspace(line->buf[5])) {
-		ret = 1; /* Should this return 0? */
+		ret = is_format_patch_separator(line->buf + 1, line->len - 1);
 		goto check_header_out;
 	}
 	if (starts_with(line->buf, "[PATCH]") && isspace(line->buf[7])) {
diff --git a/t/t5100-mailinfo.sh b/t/t5100-mailinfo.sh
index 3e64a7a..9e1ad1c 100755
--- a/t/t5100-mailinfo.sh
+++ b/t/t5100-mailinfo.sh
@@ -89,4 +89,22 @@ test_expect_success 'mailinfo on from header without name works' '
 
 '
 
+test_expect_success 'mailinfo finds headers after embedded From line' '
+	mkdir embed-from &&
+	git mailsplit -oembed-from "$TEST_DIRECTORY"/t5100/embed-from.in &&
+	test_cmp "$TEST_DIRECTORY"/t5100/embed-from.in embed-from/0001 &&
+	git mailinfo embed-from/msg embed-from/patch \
+	  <embed-from/0001 >embed-from/out &&
+	test_cmp "$TEST_DIRECTORY"/t5100/embed-from.expect embed-from/out
+'
+
+test_expect_success 'mailinfo on message with quoted >From' '
+	mkdir quoted-from &&
+	git mailsplit -oquoted-from "$TEST_DIRECTORY"/t5100/quoted-from.in &&
+	test_cmp "$TEST_DIRECTORY"/t5100/quoted-from.in quoted-from/0001 &&
+	git mailinfo quoted-from/msg quoted-from/patch \
+	  <quoted-from/0001 >quoted-from/out &&
+	test_cmp "$TEST_DIRECTORY"/t5100/quoted-from.expect quoted-from/msg
+'
+
 test_done
diff --git a/t/t5100/embed-from.expect b/t/t5100/embed-from.expect
new file mode 100644
index 0000000..06a3a38
--- /dev/null
+++ b/t/t5100/embed-from.expect
@@ -0,0 +1,5 @@
+Author: Commit Author
+Email: commit@xxxxxxxxxxx
+Subject: patch subject
+Date: Sat, 13 Sep 2014 21:13:23 -0400 
+
diff --git a/t/t5100/embed-from.in b/t/t5100/embed-from.in
new file mode 100644
index 0000000..5f3f84e
--- /dev/null
+++ b/t/t5100/embed-from.in
@@ -0,0 +1,13 @@
+From 1234567890123456789012345678901234567890 Mon Sep 17 00:00:00 2001
+From: Email Author <email@xxxxxxxxxxx>
+Date: Sun, 25 May 2008 00:38:18 -0700
+Subject: [PATCH] email subject
+
+>From 1234567890123456789012345678901234567890 Mon Sep 17 00:00:00 2001
+From: Commit Author <commit@xxxxxxxxxxx>
+Date: Sat, 13 Sep 2014 21:13:23 -0400
+Subject: patch subject
+
+patch body
+---
+patch
diff --git a/t/t5100/quoted-from.expect b/t/t5100/quoted-from.expect
new file mode 100644
index 0000000..8c9d48c
--- /dev/null
+++ b/t/t5100/quoted-from.expect
@@ -0,0 +1,3 @@
+>From the depths of history, we are stuck with the
+flaky mbox format.
+
diff --git a/t/t5100/quoted-from.in b/t/t5100/quoted-from.in
new file mode 100644
index 0000000..847e1c4
--- /dev/null
+++ b/t/t5100/quoted-from.in
@@ -0,0 +1,10 @@
+From 1234567890123456789012345678901234567890 Mon Sep 17 00:00:00 2001
+From: Author Name <somebody@xxxxxxxxxxx>
+Date: Sun, 25 May 2008 00:38:18 -0700
+Subject: [PATCH] testing quoted >From
+
+>From the depths of history, we are stuck with the
+flaky mbox format.
+
+---
+patch
-- 
2.1.0-420-g23b5121

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]