On Mon, Feb 20, 2012 at 08:55:28PM +0700, Nguyen Thai Ngoc Duy wrote:

> > Author-ident is typically utf-8 already, so you cannot assume "ASCII".
>
> I wonder if anyone puts non utf-8 strings in there, or could we
> enforce utf-8 (i.e. validate and reject non utf-8 strings) and accept
> encoded word syntax (rfc 2047) with the help of the new
> $GIT_IDENT_ENCODING variable. The "accept ..." part can wait until
> someone is hit by "utf-8 only" check and steps up.

I was just having a similar discussion with libgit2 folks, who were
wondering if there would ever be non-utf8 in there.

When we call "reencode_commit_message", it looks like we do the whole
object. In other words, your author name _must_ match any encoding you
specify in the "encoding" header. I.e., if you do:

  # latin1 é
  e=`printf '\xe9'`
  export GIT_AUTHOR_NAME="P${e}ff King"
  git init
  git config i18n.commitencoding iso8859-1
  touch foo && git add foo &&
  git commit --allow-empty -m "more latin1 ${e}ncoding"

both the name and the message should show fine on your utf8 terminal if
you do this:

  git config i18n.logoutputencoding utf8
  git show

And similarly, we do the right thing in format-patch, both with and
without logoutputencoding set:

  $ git format-patch --root --stdout | grep -Ei "^(from|subject):"
  From: =?iso8859-1?q?P=E9ff=20King?= <peff@xxxxxxxx>
  Subject: [PATCH] =?iso8859-1?q?more=20latin1=20=E9ncoding?=

  $ git config i18n.logoutputencoding utf8
  $ git format-patch --root --stdout | grep -Ei "^(from|subject):"
  From: =?utf8?q?P=C3=A9ff=20King?= <peff@xxxxxxxx>
  Subject: [PATCH] =?utf8?q?more=20latin1=20=C3=A9ncoding?=

(where 0xc3a9 is the utf8 equivalent of latin1 0xe9). So I have no idea
if people are using it or not, but it is actually usable.

-Peff
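
For anyone who wants to check the byte-level claim above (latin1 0xe9 versus utf8 0xc3a9) without involving git at all, here is a minimal sketch. It assumes iconv(1) and od(1) are installed; hex_of is a helper name made up for this example, and octal \351 stands in for \xe9 because POSIX printf does not guarantee \x escapes.

```shell
#!/bin/sh
# Sketch only: shows the raw bytes of "Péff" in latin1 and after
# conversion to utf8. Assumes iconv(1) and od(1) are available.

# hex_of (hypothetical helper): print stdin as space-separated hex bytes.
hex_of() {
    od -An -tx1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}

# Octal \351 is the single latin1 byte 0xe9.
latin1_hex=$(printf 'P\351ff' | hex_of)

# The same bytes after conversion from latin1 to utf8.
utf8_hex=$(printf 'P\351ff' | iconv -f ISO-8859-1 -t UTF-8 | hex_of)

echo "latin1: $latin1_hex"
echo "utf8:   $utf8_hex"
```

The single byte e9 in the first line should show up as the pair c3 a9 in the second, which matches the =E9 and =C3=A9 sequences in the two format-patch outputs.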