Jeff King <peff@xxxxxxxx> writes:

> On Thu, Mar 13, 2008 at 05:40:19PM +0100, Samuel Tardieu wrote:
>
>> Add MIME-Version/Content-Type/Content-Transfer-Encoding headers in
>> messages generated with git-format-patch. Without it, messages generated
>> without using --attach or --inline didn't have any content type information.
>>
>> I got hit with this problem yesterday when sending a patch to linux-kernel
>> with a commit message containing the name "Pádraig" in it. Moreover,
>> the mailing-list software added an incorrect ISO-8859-1 encoding information
>> which mangled Pádraig's name.
>
> It's supposed to handle this automatically if the commit message
> contains non-ascii characters. What version of git were you using?

You are right.  The call-chain looks like this:

    log_tree_diff_flush()
      show_log()
        log_write_email_headers()
            writes mbox From
        pretty_print_commit()
            check commit log if it is pure ascii
          pp_header()
            pp_user_info()
                writes RFC2822 From:
          pp_title_line()
              writes RFC2822 Subject:
              writes MIME-Version: and friends if needed
          pp_remainder()
              writes the remainder of the log message
        append_signoff()
        printf("---\n")
      diff_flush()
          writes the patch

At the beginning of pretty_print_commit() we look at the log, and if
it is not ASCII we pass that information down to pp_title_line(),
which is responsible for writing the MIME headers at the appropriate
place.

If your patch itself has some non-ASCII material, and if your commit
log message is pure ASCII, the above would end up not writing MIME
headers at all.  If your commit log message is non-ASCII, then
pp_title_line() will mark the entire message as if it were in the
encoding of the log.

This might look like a problem, but it is not something the
non-multipart output of format-patch should even try to cater to.
The payload (i.e. the patch) out of git has always been an
uninterpreted sequence of bytes (and that is not going to change).
A patch to i18n po/ files, for example, could contain patches to
different files encoded in KOI-8, BIG5, EUC-JP and UTF-8 at the same
time.  There is no way to say "text/plain; charset=X" for such a
payload (because there is no single charset used in such a patch),
and git simply neither knows nor cares what encoding each file is in.
The output from git marks only the part whose encoding git knows
(i.e. the commit log message).

Having said all that, I notice that the addition of the
format.headers variable (which I think is a later invention) was not
done quite correctly.

In the call-chain above, pretty_print_commit() checks the commit log,
but it is meant to do so only when we haven't already emitted MIME
Content-Type: (because the user told us to do multipart), and the
"after_subject" parameter was getting passed around for it (and its
callees) to detect exactly that.  But format.headers misused that
variable to carry its contents along.  There needs to be a way to
pass "have we said MIME-Version crap already" separately.
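
To illustrate the separation argued for in the last paragraph, here
is a minimal sketch.  It is not git's actual code: struct header_ctx,
log_needs_8bit(), should_write_mime() and write_after_subject() are
hypothetical names made up for this example, and the exact header
text is an assumption.  The point is only that the user-supplied
header text and the "MIME already written" state travel as two
separate fields.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-message state; not git's actual data structure. */
struct header_ctx {
	const char *extra_headers;  /* user-supplied headers, or NULL */
	bool mime_already_written;  /* true once MIME-Version: has gone out */
};

/* Return true if the log contains any byte outside 7-bit ASCII. */
bool log_needs_8bit(const char *log)
{
	for (; *log; log++)
		if ((unsigned char)*log & 0x80)
			return true;
	return false;
}

/* MIME headers are needed only for a non-ASCII log, and only once. */
bool should_write_mime(const struct header_ctx *ctx, const char *log)
{
	return !ctx->mime_already_written && log_needs_8bit(log);
}

/* Emit whatever belongs right after the Subject: line. */
void write_after_subject(FILE *out, struct header_ctx *ctx,
			 const char *log, const char *encoding)
{
	/* Extra headers are written regardless of the MIME decision. */
	if (ctx->extra_headers)
		fputs(ctx->extra_headers, out);

	if (should_write_mime(ctx, log)) {
		fprintf(out,
			"MIME-Version: 1.0\n"
			"Content-Type: text/plain; charset=%s\n"
			"Content-Transfer-Encoding: 8bit\n",
			encoding);
		ctx->mime_already_written = true;
	}
}
```

With this shape, setting extra headers no longer suppresses the
non-ASCII check; only the multipart path (or an earlier call) setting
mime_already_written does.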