[PATCH 2/3] format-patch: wrap long header lines

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Subject and identity headers may be arbitrarily long. In the
past, we just assumed that single-line headers would be
reasonably short. For multi-line subjects that we squish
into a single line, we just "pre-folded" the data in
pp_title_line by adding a newline and indentation.

There were two problems. One is that, although rare,
single-line messages can actually be longer than the
recommended line-length limits. The second is that the
pre-folding interacted badly with rfc2047 encoding, leading
to malformed headers.

Instead, let's stop pre-folding the subject lines, and just
fold everything based on length in add_rfc2047, whether
it is encoded or not.

Signed-off-by: Jeff King <peff@xxxxxxxx>
---
Three things to note:

  1. We call strbuf_add_wrapped_bytes for the non-encoded case. This
     nicely wraps on word and multi-character boundaries. But it will
     never do a "hard" wrap if there are no word boundaries, and it
     probably should at the 998-character mark which rfc2822 specifies
     as a hard limit.

     I don't know how much we care. For something like that you'd have
     to be maliciously trying to create a bogus patch. If you're mailing
     it, you could just create the bogus mail by hand. If you're trying
     to buffer overflow somebody's "format-patch | am" script, it won't
     do anything, as mailinfo does not have a limit on line length.

  2. For the non-quoted case, technically we want to indent less on the
     first line than we do on subsequent lines (to account for "Subject:
     [PATCH]"). strbuf_add_wrapped_bytes doesn't support that notion. We
     could add it, but it probably doesn't matter. We just end up
     wrapping the subsequent lines a little tighter than we need to.

  3. I used RFC2822's SHOULD value of 78 characters as a line length.
     That's probably unnecessarily conservative. In theory wrapping
     shouldn't make a difference to the data, but maybe people who
     hand-edit the result would prefer not to see wrapping? I dunno. In
     that case, we could set it to something higher like 120, which
     would still wrap the really ridiculous cases.

 pretty.c                |   32 +++++++++++++----
 t/t4014-format-patch.sh |   84 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 108 insertions(+), 8 deletions(-)

diff --git a/pretty.c b/pretty.c
index 8549934..0e167f4 100644
--- a/pretty.c
+++ b/pretty.c
@@ -216,7 +216,15 @@ static int is_rfc2047_special(char ch)
 static void add_rfc2047(struct strbuf *sb, const char *line, int len,
 		       const char *encoding)
 {
-	int i, last;
+	static const int max_length = 78; /* per rfc2822 */
+	int i;
+	int line_len;
+
+	/* How many bytes are already used on the current line? */
+	for (i = sb->len - 1; i >= 0; i--)
+		if (sb->buf[i] == '\n')
+			break;
+	line_len = sb->len - (i+1);
 
 	for (i = 0; i < len; i++) {
 		int ch = line[i];
@@ -225,14 +233,21 @@ static void add_rfc2047(struct strbuf *sb, const char *line, int len,
 		if ((i + 1 < len) && (ch == '=' && line[i+1] == '?'))
 			goto needquote;
 	}
-	strbuf_add(sb, line, len);
+	strbuf_add_wrapped_bytes(sb, line, len, 0, 1, max_length - line_len);
 	return;
 
 needquote:
 	strbuf_grow(sb, len * 3 + strlen(encoding) + 100);
 	strbuf_addf(sb, "=?%s?q?", encoding);
-	for (i = last = 0; i < len; i++) {
+	line_len += strlen(encoding) + 5; /* 5 for =??q? */
+	for (i = 0; i < len; i++) {
 		unsigned ch = line[i] & 0xFF;
+
+		if (line_len >= max_length - 2) {
+			strbuf_addf(sb, "?=\n =?%s?q?", encoding);
+			line_len = strlen(encoding) + 5 + 1; /* =??q? plus SP */
+		}
+
 		/*
 		 * We encode ' ' using '=20' even though rfc2047
 		 * allows using '_' for readability.  Unfortunately,
@@ -240,12 +255,14 @@ needquote:
 		 * leave the underscore in place.
 		 */
 		if (is_rfc2047_special(ch) || ch == ' ') {
-			strbuf_add(sb, line + last, i - last);
 			strbuf_addf(sb, "=%02X", ch);
-			last = i + 1;
+			line_len += 3;
+		}
+		else {
+			strbuf_addch(sb, ch);
+			line_len++;
 		}
 	}
-	strbuf_add(sb, line + last, len - last);
 	strbuf_addstr(sb, "?=");
 }
 
@@ -1106,11 +1123,10 @@ void pp_title_line(enum cmit_fmt fmt,
 		   const char *encoding,
 		   int need_8bit_cte)
 {
-	const char *line_separator = (fmt == CMIT_FMT_EMAIL) ? "\n " : " ";
 	struct strbuf title;
 
 	strbuf_init(&title, 80);
-	*msg_p = format_subject(&title, *msg_p, line_separator);
+	*msg_p = format_subject(&title, *msg_p, " ");
 
 	strbuf_grow(sb, title.len + 1024);
 	if (subject) {
diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
index 027c13d..9c66367 100755
--- a/t/t4014-format-patch.sh
+++ b/t/t4014-format-patch.sh
@@ -709,4 +709,88 @@ test_expect_success TTY 'format-patch --stdout paginates' '
 	test_path_is_missing .git/pager_used
 '
 
+test_expect_success 'format-patch handles multi-line subjects' '
+	rm -rf patches/ &&
+	echo content >>file &&
+	for i in one two three; do echo $i; done >msg &&
+	git add file &&
+	git commit -F msg &&
+	git format-patch -o patches -1 &&
+	grep ^Subject: patches/0001-one.patch >actual &&
+	echo "Subject: [PATCH] one two three" >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'format-patch handles multi-line encoded subjects' '
+	rm -rf patches/ &&
+	echo content >>file &&
+	for i in en tvà tre; do echo $i; done >msg &&
+	git add file &&
+	git commit -F msg &&
+	git format-patch -o patches -1 &&
+	grep ^Subject: patches/0001-en.patch >actual &&
+	echo "Subject: [PATCH] =?UTF-8?q?en=20tv=C3=A5=20tre?=" >expect &&
+	test_cmp expect actual
+'
+
+M8="foo bar "
+M64=$M8$M8$M8$M8$M8$M8$M8$M8
+M512=$M64$M64$M64$M64$M64$M64$M64$M64
+cat >expect <<'EOF'
+Subject: [PATCH] foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo
+ bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar
+ foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo
+ bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar
+ foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo
+ bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar
+ foo bar foo bar foo bar foo bar foo bar foo bar foo bar foo
+ bar foo bar foo bar foo bar foo bar foo bar foo bar foo bar
+ foo bar foo bar foo bar foo bar
+EOF
+test_expect_success 'format-patch wraps extremely long headers (ascii)' '
+	echo content >>file &&
+	git add file &&
+	git commit -m "$M512" &&
+	git format-patch --stdout -1 >patch &&
+	sed -n "/^Subject/p; /^ /p; /^$/q" <patch >subject &&
+	test_cmp expect subject
+'
+
+M8="fÃÃ bar "
+M64=$M8$M8$M8$M8$M8$M8$M8$M8
+M512=$M64$M64$M64$M64$M64$M64$M64$M64
+cat >expect <<'EOF'
+Subject: [PATCH] =?UTF-8?q?f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6=C3=B6=20bar=20f=C3=B6?=
+ =?UTF-8?q?=C3=B6=20bar=20f=C3=B6=C3=B6=20bar?=
+EOF
+test_expect_success 'format-patch wraps extremely long headers (rfc2047)' '
+	rm -rf patches/ &&
+	echo content >>file &&
+	git add file &&
+	git commit -m "$M512" &&
+	git format-patch --stdout -1 >patch &&
+	sed -n "/^Subject/p; /^ /p; /^$/q" <patch >subject &&
+	test_cmp expect subject
+'
+
 test_done
-- 
1.7.2.5.15.gfdd1c

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]