Re: a bug about format-patch of multibyte characters comment

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Feb 12, 2011 at 07:13:15PM +0900, xiaozhu wrote:

> I commit a fix to my repository with comment like following:
> -----------------------------------------------------
> XXXXXXXXXXXX
> YYYYYY
> -----------------------------------------------------
> 
> two lines of multibyte language comment.

OK.

> then I use format-patch to export this fix, I get a patch file like following:
> 
> ------------------------------------------------------------------------------
> From d3532c3263a02a2367a3aa5c9cc3f0bd738b79b1 Mon Sep 17 00:00:00 2001
> From: xz <xz>
> Date: Fri, 11 Feb 2011 21:30:35 +0900
> Subject: [PATCH] =?UTF-8?q?=E6=97=A5=E6=9C=AC=E8=AA=9E=E3=81=8C=E5=A4=A7=E4=B8=88=E5=A4=AB
> =20=E6=94=B9=E8=A1=8C=E3=81=99=E3=82=8B?=

Yeah, this is wrong. There should be a whitespace indentation in a
multi-line header, or the whole thing should be on one line. The newline
in your commit subject is apparently leaking through, and it should be
qp-encoded.

> const char *format_subject(struct strbuf *sb, const char *msg,
> 			   const char *line_separator)
> {
> 	int first = 1;
> 
> 	for (;;) {
> 		const char *line = msg;
> 		int linelen = get_one_line(line);
> 
> 		msg += linelen;
> 		
> 		if (!linelen || is_empty_line(line, &linelen))
> 			break;
> 
> 		if (!sb)cat
> 			continue;
> 		strbuf_grow(sb, linelen + 2);
> 		if (!first)
> 			strbuf_addstr(sb, line_separator);
> 		strbuf_add(sb, line, linelen);
> 		first = 0;
> 	}
> 	return msg;
> }
> 
> At first I want to know: Does this function means that always add the first line
> of comment to the argument sb, then return the rest? Is there any other thing that I
> didn't considered?

No. It adds all lines in the first paragraph into sb. Which, if you
follow the usual git convention, is a single line. But we also see
commits imported from other version control systems where that
convention is not as prevalent. In practice, using the whole first
paragraph seems to make the most sense as a subject.

> At last, if what I think is correct, I plan to fix it as following:
> 
> const char *format_subject(struct strbuf *sb, const char *msg,
> 			   const char *line_separator)
> {
> 	int first = 1;
> 
> 	//for (;;) {
> 		const char *line = msg;
> 		int linelen = get_one_line(line);
> 
> 		msg += linelen;
> 		
> 		if (!linelen || is_empty_line(line, &linelen)) return msg;
> 			//break;
> 
> 		if (!sb) return msg;
> 			//continue;
> 		strbuf_grow(sb, linelen + 2);
> 		if (!first)
> 			strbuf_addstr(sb, line_separator);
> 		strbuf_add(sb, line, linelen);
> 		first = 0;
> 	//}
> 	return msg;
> }

No, that's not right. It breaks the use-whole-paragraph feature. The
right fix is to encode the embedded newline in the subject properly.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]