"git mailinfo" (hence "git am") understands some well-known headers, like "Subject: ", "Date: " and "From: ", placed at the beginning of the message body (and the "--scissors" can discard the part of the body before a scissors-mark). However, some people throw other kinds of header-looking things there, expecting them to be discarded. Finding and discarding anything that looks like RFC2822 header is not a right solution. The body of the message may start with a line that begins with a word followed by a colon that is a legitimate part of the message that should not be discarded. Instead, keep reading non-blank lines once we see an in-body header at the beginning and discard them. Nobody will be insane enough to reorder the headers to read like this: Garbage-non-in-body-header: here Subject: in-body subject Here is the body of the commit log. but it is common for lazy or misguided people to leave non-header materials in-body like this: From: Junio C Hamano <gitster@xxxxxxxxx> Date: Mon, 28 Sep 2015 19:19:27 -0700 Subject: [PATCH] Git 2.6.1 MIME-Version: 1.0 Signed-off-by: Junio C Hamano <gitster@xxxxxxxxx> --- I think it is wrong for the in-body header codepath to pay attention to content-transfer-encodings and stuff, but that is a separate issue. Also if you remove the "does the line even look like a header?" check, some tests in t5100 starts failing. E.g. From nobody Mon Sep 17 00:00:00 2001 From: A U Thor <a.u.thor@xxxxxxxxxxx> Subject: check bogus body header (from) Date: Fri, 9 Jun 2006 00:44:16 -0700 From: bogosity - a list - of stuff wants to make sure the list of two bulletted-items are in the commit log, and the in-body From: line gets used. So I dunno. I am not entirely convinced that this is a good change. 
 builtin/mailinfo.c  | 34 +++++++++++++++++++++++++++++++---
 t/t5100-mailinfo.sh |  3 ++-
 t/t5100/info0018    |  5 +++++
 t/t5100/msg0018     |  2 ++
 t/t5100/patch0018   |  6 ++++++
 t/t5100/sample.mbox | 18 ++++++++++++++++++
 6 files changed, 64 insertions(+), 4 deletions(-)
 create mode 100644 t/t5100/info0018
 create mode 100644 t/t5100/msg0018
 create mode 100644 t/t5100/patch0018

diff --git a/builtin/mailinfo.c b/builtin/mailinfo.c
index 999a525..169ee54 100644
--- a/builtin/mailinfo.c
+++ b/builtin/mailinfo.c
@@ -787,18 +787,46 @@ static int is_scissors_line(const struct strbuf *line)
 
 static int handle_commit_msg(struct strbuf *line)
 {
+	/*
+	 * Are we still scanning and discarding in-body headers?
+	 * It is initially set to 1, set to 2 when we do see a
+	 * valid in-body header.
+	 */
 	static int still_looking = 1;
+	int is_empty_line;
 
 	if (!cmitmsg)
 		return 0;
 
-	if (still_looking) {
-		if (!line->len || (line->len == 1 && line->buf[0] == '\n'))
+	is_empty_line = (!line->len || (line->len == 1 && line->buf[0] == '\n'));
+	if (still_looking == 1) {
+		/*
+		 * Haven't seen a known in-body header; discard an empty line.
+		 */
+		if (is_empty_line)
 			return 0;
 	}
 
 	if (use_inbody_headers && still_looking) {
-		still_looking = check_header(line, s_hdr_data, 0);
+		int is_known_header = check_header(line, s_hdr_data, 0);
+
+		if (still_looking == 2) {
+			/*
+			 * an empty line after the in-body header block,
+			 * or a line obviously not an attempt to invent
+			 * an unsupported in-body header.
+			 */
+			if (is_empty_line || !is_rfc2822_header(line))
+				still_looking = 0;
+			if (is_empty_line)
+				return 0;
+			/* otherwise do not discard the line, but keep going */
+		} else if (is_known_header) {
+			still_looking = 2;
+		} else if (still_looking != 2) {
+			still_looking = 0;
+		}
+
 		if (still_looking)
 			return 0;
 	} else
diff --git a/t/t5100-mailinfo.sh b/t/t5100-mailinfo.sh
index e97cfb2..3ce041b 100755
--- a/t/t5100-mailinfo.sh
+++ b/t/t5100-mailinfo.sh
@@ -11,7 +11,8 @@ test_expect_success 'split sample box' \
 	'git mailsplit -o. "$TEST_DIRECTORY"/t5100/sample.mbox >last &&
 	last=`cat last` &&
 	echo total is $last &&
-	test `cat last` = 17'
+	test `cat last` = 18
+'
 
 check_mailinfo () {
 	mail=$1 opt=$2
diff --git a/t/t5100/info0018 b/t/t5100/info0018
new file mode 100644
index 0000000..ec671fc
--- /dev/null
+++ b/t/t5100/info0018
@@ -0,0 +1,5 @@
+Author: A U Thor
+Email: a.u.thor@xxxxxxxxxxx
+Subject: A E I O U
+Date: Mon, 17 Sep 2012 14:23:49 -0700
+
diff --git a/t/t5100/msg0018 b/t/t5100/msg0018
new file mode 100644
index 0000000..2ee0900
--- /dev/null
+++ b/t/t5100/msg0018
@@ -0,0 +1,2 @@
+New content here
+
diff --git a/t/t5100/patch0018 b/t/t5100/patch0018
new file mode 100644
index 0000000..35cf84c
--- /dev/null
+++ b/t/t5100/patch0018
@@ -0,0 +1,6 @@
+diff --git a/foo b/foo
+index e69de29..d95f3ad 100644
+--- a/foo
++++ b/foo
+@@ -0,0 +1 @@
++New content
diff --git a/t/t5100/sample.mbox b/t/t5100/sample.mbox
index 8b2ae06..d7c5878 100644
--- a/t/t5100/sample.mbox
+++ b/t/t5100/sample.mbox
@@ -406,6 +406,7 @@ Subject: re: [PATCH] another patch
 
 From: A U Thor <a.u.thor@xxxxxxxxxxx>
 Subject: [PATCH] another patch
+
 >Here is an empty patch from A U Thor.
 
 Hey you forgot the patch!
@@ -699,3 +700,20 @@ index e69de29..d95f3ad 100644
 +++ b/foo
 @@ -0,0 +1 @@
 +New content
+From nobody Mon Sep 17 00:00:00 2001
+From: A U Thor <a.u.thor@xxxxxxxxxxx>
+Subject: Re: some discussion title
+Date: Mon, 17 Sep 2012 14:23:49 -0700
+
+Subject: A E I O U
+MIME-VERSION: 1.0
+Garbage: Not a valid in-body header
+
+New content here
+
+diff --git a/foo b/foo
+index e69de29..d95f3ad 100644
+--- a/foo
++++ b/foo
+@@ -0,0 +1 @@
++New content
-- 
2.6.1-296-ge15092e