Re: `git-send-email' doesn't specify `Content-Type'

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Nov 10, 2007 at 01:51:26PM +0100, Björn Steinbrink wrote:

> On 2007.11.10 04:35:05 -0800, Brian Swetland wrote:
> > The first line of the patch is a From: field with Arve's name, in
> > an (rfc822?) encoded format):
> > From: =?utf-8?q?Arve=20Hj=C3=B8nnev=C3=A5g?= <arve@xxxxxxxxxxx>

It's rfc2047 (and you can grep for that in git-send-email).

> Ah! Commit author differs from mail sender, didn't think of that. That's
> probably the same problem as with the -s option, ie. that git-send-email
> only looks at the existing text and not add anything it adds itself when
> checking the encoding. Sorry for the noise.

It's not the same problem; the '-s' problem was git-format-patch, and
this is git-send-email. In fact, git-format-patch correctly notes the
encoding in the header. It is git-send-email in this case that takes the
encoded and properly marked header, deciphers it, throws away the
original encoding, and sticks it into the message body without
considering the encoding of the body.

So I think you would want to:
  1. remember the encoding pulled from the rfc2047 header
  2. When prepending the author line to the message, consider the
     body encoding.
  2a. If no encoding, then the body is US-ASCII and we can presumably
      just add
         MIME-Version: 1.0
         Content-Type: text/plain; charset=$enc
  2b. If there is an encoding, we need to Iconv from the name
      encoding to the body encoding.

However, as it stands now, our rfc2047 unquoting _always_ assumes that
we are in utf-8 for the name (which is probably true if the messages
came out of git-format-patch with default-ish settings). So the easy,
hackish way is probably to just add the MIME-Version and 'Content-type:
text/plain; charset=utf-8' headers if we unquoted the author field.

If we want to accept arbitrary messages, below is a patch to at least
have unquote_rfc2047 return the right information (and then on
git-send-email.perl:758, where we prepend $author, the encoding would
need to be taken into account as I described above).

Given that git-send-email is already pretty dependent on
git-format-patch output (and nobody has been complaining about its
rfc2047 handling so far!) the easy, hackish way is probably the best.

-Peff

---
diff --git a/git-send-email.perl b/git-send-email.perl
index f9bd2e5..4f8297f 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -514,11 +514,13 @@ $time = time - scalar $#files;
 
 sub unquote_rfc2047 {
 	local ($_) = @_;
-	if (s/=\?utf-8\?q\?(.*)\?=/$1/g) {
+	my $encoding;
+	if (s/=\?([^?])+\?q\?(.*)\?=/$2/g) {
+		$encoding = $1;
 		s/_/ /g;
 		s/=([0-9A-F]{2})/chr(hex($1))/eg;
 	}
-	return "$_";
+	return "$_", $encoding;
 }
 
 # use the simplest quoting being able to handle the recipient
@@ -667,6 +669,7 @@ foreach my $t (@files) {
 	open(F,"<",$t) or die "can't open file $t";
 
 	my $author = undef;
+	my $author_encoding;
 	@cc = @initial_cc;
 	@xh = ();
 	my $input_format = undef;
@@ -692,7 +695,8 @@ foreach my $t (@files) {
 						next if ($suppress_from);
 					}
 					elsif ($1 eq 'From') {
-						$author = unquote_rfc2047($2);
+						($author, $author_encoding)
+						  = unquote_rfc2047($2);
 					}
 					printf("(mbox) Adding cc: %s from line '%s'\n",
 						$2, $_) unless $quiet;
-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux