On Sat, Nov 10, 2007 at 01:51:26PM +0100, Björn Steinbrink wrote:

> On 2007.11.10 04:35:05 -0800, Brian Swetland wrote:
> > The first line of the patch is a From: field with Arve's name, in
> > an (rfc822?) encoded format):
> > From: =?utf-8?q?Arve=20Hj=C3=B8nnev=C3=A5g?= <arve@xxxxxxxxxxx>

It's rfc2047 (and you can grep for that in git-send-email).

> Ah! Commit author differs from mail sender, didn't think of that. That's
> probably the same problem as with the -s option, ie. that git-send-email
> only looks at the existing text and not add anything it adds itself when
> checking the encoding. Sorry for the noise.

It's not the same problem; the '-s' problem was git-format-patch, and
this is git-send-email. In fact, git-format-patch correctly notes the
encoding in the header. It is git-send-email in this case that takes the
encoded and properly marked header, deciphers it, throws away the
original encoding, and sticks it into the message body without
considering the encoding of the body.

So I think you would want to:

  1. remember the encoding pulled from the rfc2047 header

  2. When prepending the author line to the message, consider the body
     encoding.

     2a. If no encoding, then the body is US-ASCII and we can
         presumably just add

           MIME-Version: 1.0
           Content-Type: text/plain; charset=$enc

     2b. If there is an encoding, we need to Iconv from the name
         encoding to the body encoding.

However, as it stands now, our rfc2047 unquoting _always_ assumes that
we are in utf-8 for the name (which is probably true if the messages
came out of git-format-patch with default-ish settings). So the easy,
hackish way is probably to just add the MIME-Version and 'Content-type:
text/plain; charset=utf-8' headers if we unquoted the author field.

If we want to accept arbitrary messages, below is a patch to at least
have unquote_rfc2047 return the right information (and then on
git-send-email.perl:758, where we prepend $author, the encoding would
need to be taken into account as I described above).

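As a rough illustration of steps 1, 2a and 2b, here is a minimal sketch
in Python rather than Perl. The helper name prepend_author and all of
its parameters are hypothetical and do not exist in git-send-email. It
assumes the author name has already been unquoted and decoded to a
string, that the charset from the rfc2047 header was remembered, and
that the body is available as raw bytes.

```python
def prepend_author(author, author_enc, body, body_charset=None):
    """Return (extra_headers, new_body) with a From: line prepended.

    Hypothetical helper, not part of git-send-email.
    author       -- unquoted author name, already decoded to str
    author_enc   -- charset remembered from the rfc2047 header (step 1),
                    or None if the name was not quoted
    body         -- raw message body as bytes
    body_charset -- charset declared for the body, or None if the
                    message carries no Content-Type charset
    """
    extra_headers = []
    if body_charset is None:
        if author_enc is None:
            # Neither the name nor the body declares a charset:
            # treat everything as plain ASCII, nothing to announce.
            body_charset = "ascii"
        else:
            # 2a: the body is presumably US-ASCII, so it is also valid
            # in the author's charset; declare that charset.
            body_charset = author_enc
            extra_headers = [
                "MIME-Version: 1.0",
                "Content-Type: text/plain; charset=%s" % author_enc,
            ]
    # 2b: convert the author line into the body's charset.
    from_line = ("From: %s\n\n" % author).encode(body_charset)
    return extra_headers, from_line + body
```

If the name cannot be represented in the body's charset, the encode
call raises UnicodeEncodeError; a real implementation would have to
decide what to do in that case.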
Given that git-send-email is already pretty dependent on
git-format-patch output (and nobody has been complaining about its
rfc2047 handling so far!) the easy, hackish way is probably the best.

-Peff

---
diff --git a/git-send-email.perl b/git-send-email.perl
index f9bd2e5..4f8297f 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -514,11 +514,13 @@ $time = time - scalar $#files;
 
 sub unquote_rfc2047 {
 	local ($_) = @_;
-	if (s/=\?utf-8\?q\?(.*)\?=/$1/g) {
+	my $encoding;
+	if (s/=\?([^?]+)\?q\?(.*)\?=/$2/g) {
+		$encoding = $1;
 		s/_/ /g;
 		s/=([0-9A-F]{2})/chr(hex($1))/eg;
 	}
-	return "$_";
+	return "$_", $encoding;
 }
 
 # use the simplest quoting being able to handle the recipient
@@ -667,6 +669,7 @@ foreach my $t (@files) {
 	open(F,"<",$t) or die "can't open file $t";
 
 	my $author = undef;
+	my $author_encoding;
 	@cc = @initial_cc;
 	@xh = ();
 	my $input_format = undef;
@@ -692,7 +695,8 @@ foreach my $t (@files) {
 					next if ($suppress_from);
 				}
 				elsif ($1 eq 'From') {
-					$author = unquote_rfc2047($2);
+					($author, $author_encoding)
+					  = unquote_rfc2047($2);
 				}
 				printf("(mbox) Adding cc: %s from line '%s'\n",
 					$2, $_) unless $quiet;
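
For readers who want to try the idea outside of Perl, the following is
a small Python sketch of an unquoting routine that returns both the
decoded text and the charset named in the encoded word. It is only an
illustration of the approach, not a port of the patch: it handles the
"q" encoding only, and unlike the Perl regex it matches non-greedily so
that several encoded words on one line are decoded separately.

```python
import re

# Matches one rfc2047 "q"-encoded word: =?charset?q?text?=
_ENCODED_WORD = re.compile(r"=\?([^?]+)\?q\?(.*?)\?=", re.IGNORECASE)


def unquote_rfc2047(value):
    """Return (decoded_text, charset); charset is None if nothing was quoted."""
    charset = None

    def decode_word(match):
        nonlocal charset
        charset = match.group(1)
        # In the "q" encoding an underscore stands for a space.
        text = match.group(2).replace("_", " ")
        # Turn each =XX escape into the byte it names, then decode the
        # resulting bytes using the charset from the encoded word.
        raw = re.sub(
            rb"=([0-9A-Fa-f]{2})",
            lambda m: bytes([int(m.group(1), 16)]),
            text.encode("ascii"),
        )
        return raw.decode(charset)

    return _ENCODED_WORD.sub(decode_word, value), charset


author, author_encoding = unquote_rfc2047(
    "=?utf-8?q?Arve=20Hj=C3=B8nnev=C3=A5g?= <arve@xxxxxxxxxxx>"
)
```

In real Python code the standard library's email.header.decode_header
covers the general case, including the "b" encoding.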