On Wed, Jul 04, 2012 at 03:19:31PM +0900, Katsuyama Takeharu wrote:

> diff --git a/pretty.c b/pretty.c
> --- a/pretty.c
> +++ b/pretty.c
> @@ -272,6 +272,13 @@ static void add_rfc2047(struct strbuf *sb, const char *line, int len,
>  	static const int max_length = 78; /* per rfc2822 */
>  	int i;
>  	int line_len;
> +	int utf8_ctr, use_utf8;
> +	const char *utf8_start;
> +
> +	if (is_encoding_utf8(encoding) && encoding != NULL)
> +		use_utf8 = 1;
> +	else
> +		use_utf8 = 0;

I think you can drop the "encoding != NULL" here. If we don't have an
explicit encoding, git always assumes utf8. (Also, as it happens, we
never hit this point with a NULL encoding in the current code anyway,
though that could in theory change in the future.)

> > Can we re-use utf8_width here instead of rewriting these rules?
>
> Yes you can. But there is an issue: utf8_width does not seem to return
> the correct value. It returns 2 even if the provided code is a 3-octet
> utf-8 char (e.g. 0xE38292).
> I expect it to return 3.

Hmm. I think I may have led you astray. It seems that the return value
of utf8_width is not about the byte-width of the character
representation, but rather about the intended character-width of the
glyph. But since we are encoding the bytes, we care about the former.

So I think you would really want to use pick_one_utf8_char and see how
many bytes it consumed, like this:

  const char *p = &line[i];
  pick_one_utf8_char(&p, NULL);
  if (!p) /* not valid utf8, just assume single byte */
          utf8_ctr = 1;
  else
          utf8_ctr = p - &line[i];

-Peff
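
For illustration only, here is a rough standalone sketch of the "count the bytes, not the glyph width" idea, written without any of git's internals. The helper name utf8_seq_len and its "remaining" parameter are made up for this example; they are not git functions, and the check is simplified (it does not reject overlong forms or surrogates the way a full decoder would). On invalid or truncated input it falls back to a single byte, in the same spirit as the snippet above.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of git: return the number of bytes in
 * the UTF-8 sequence starting at s, looking at no more than `remaining`
 * bytes. Returns 0 if there is nothing to read, and 1 for invalid or
 * truncated input ("just assume single byte").
 */
static size_t utf8_seq_len(const char *s, size_t remaining)
{
	const unsigned char *u = (const unsigned char *)s;
	size_t len, i;

	if (!remaining)
		return 0;
	if (u[0] < 0x80)
		return 1;	/* plain ASCII */
	else if ((u[0] & 0xe0) == 0xc0)
		len = 2;	/* 110xxxxx */
	else if ((u[0] & 0xf0) == 0xe0)
		len = 3;	/* 1110xxxx */
	else if ((u[0] & 0xf8) == 0xf0)
		len = 4;	/* 11110xxx */
	else
		return 1;	/* stray continuation or invalid lead byte */

	if (len > remaining)
		return 1;	/* sequence is cut off */
	for (i = 1; i < len; i++)
		if ((u[i] & 0xc0) != 0x80)
			return 1;	/* not a continuation byte */
	return len;
}
```

With this sketch, the three octets 0xE3 0x82 0x92 count as one 3-byte unit, which is the number that matters when deciding where an encoded line may be split, regardless of how many columns the glyph occupies on screen.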