Re: [PATCH 2/2] send-email: rfc2047-quote subject lines with non-ascii characters

On Wed, Mar 26, 2008 at 07:59:48AM +0200, Teemu Likonen wrote:

> These patches seem to work, except that the Subject field is quoted 
> only if the user types non-ASCII text at the "What subject should 
> the initial email start with?" prompt. If she changes the subject in 
> the editor, it won't be rfc2047-quoted.

Ah, yes, I hadn't considered that. We should definitely do the quoting
after all of the user's input. Replace 2/2 from my series with the patch
below, which handles this case correctly (and as a bonus, the user sees
the unencoded subject in the editor, which is much more readable).
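
As a rough standalone illustration of that ordering (not part of the patch):
the header line is rewritten only after the editor has been closed, and only
when the subject contains non-ASCII bytes. The helper name
encode_subject_line is hypothetical, and the sketch assumes the subject is
already a string of utf-8 bytes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Q-encode a string of raw bytes as an RFC 2047 encoded-word.
# Assumption: $text already holds bytes in $encoding (utf-8 by default).
sub quote_rfc2047 {
	my ($text, $encoding) = @_;
	$encoding ||= 'utf-8';
	# Every byte outside the small "safe" set becomes =XX.
	$text =~ s/([^-a-zA-Z0-9!*+\/])/sprintf("=%02X", ord($1))/eg;
	return '=?' . $encoding . '?q?' . $text . '?=';
}

# Hypothetical helper: rewrite one header line of the edited message.
# Pure-ASCII subjects and all other headers pass through unchanged.
sub encode_subject_line {
	my ($line) = @_;
	if ($line =~ /^Subject: ?(.*)/i) {
		my $subject = $1;
		$subject = quote_rfc2047($subject)
			if $subject =~ /[^[:ascii:]]/;
		return "Subject: $subject\n";
	}
	return $line;
}

# Prints: Subject: =?utf-8?q?utf8-s=C3=BCbj=C3=ABct?=
print encode_subject_line("Subject: utf8-s\xC3\xBCbj\xC3\xABct\n");
# Prints: Subject: plain ascii
print encode_subject_line("Subject: plain ascii\n");
```

Because the encoding happens on the way out, the file shown in the editor
still carries the readable, unencoded subject.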

> Thank you anyway, I think we're going in the right direction. I think 
> 'git send-email --compose' is a nice way to produce an introductory 
> message for a patch series. If --compose doesn't support MIME encoding 
> in a reasonable way, the user may have to write and send the intro 
> message with a real MUA and find out the Message-Id for the correct 
> In-Reply-To field of the actual patch series.

git-format-patch recently got a --cover-letter option which does the
same thing. I actually use a real MUA (mutt) instead of send-email, and
with --cover-letter you can avoid the message-id cutting and pasting
that would otherwise be required. The encodings automatically come out
right because I end up sending the message using my MUA.

> E-mail agents KMail and Mutt have a setting for preferred encodings for 
> outgoing mail. It's a list of encodings, 
> like "us-ascii,iso-8859-1,utf-8". The first one that fits (including 
> From, To, Cc, Subject, the body, ...?) is used, so there is some kind 
> of detection of content after the message has been composed.

Yes, the git-send-email code is a real mess for this sort of thing. I
think it started very small and specific, and has gotten hack upon hack
piled on it. It would be much nicer rewritten from scratch around one of
the many abstracted perl mail objects (though that does introduce a new
dependency).
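
For reference, the "first one that fits" selection described in the quote
could be sketched with the core Encode module roughly as below. This is only
an illustration under stated assumptions: first_fitting_charset is a
hypothetical helper, the text is assumed to be a decoded Perl character
string, and nothing here claims to mirror how KMail or Mutt implement it.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: return the first charset in @candidates that can
# represent every character of $text, or nothing if none can.
sub first_fitting_charset {
	my ($text, @candidates) = @_;
	for my $charset (@candidates) {
		# FB_CROAK makes encode() die on an unmappable character;
		# LEAVE_SRC keeps $text intact for the next attempt.
		my $ok = eval {
			encode($charset, $text,
				Encode::FB_CROAK() | Encode::LEAVE_SRC());
			1;
		};
		return $charset if $ok;
	}
	return;
}

my @preferred = qw(us-ascii iso-8859-1 utf-8);

# Decode the utf-8 bytes into characters before probing the charsets.
my $subject = decode('utf-8', "utf8-s\xC3\xBCbj\xC3\xABct");

print first_fitting_charset($subject, @preferred), "\n";       # iso-8859-1
print first_fitting_charset("plain", @preferred), "\n";        # us-ascii
print first_fitting_charset("\x{20AC}uro", @preferred), "\n";  # utf-8
```

A real implementation would have to run this over the headers and the body
together, which is exactly where the current structure of the script gets in
the way.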

> If portable content encoding detection is difficult or considered 
> unnecessary, then I think a documented configurable option is fine 
> (UTF-8 by default).

I think that is sensible. Want to try adding it on top of my patches?

Below is the revised subject-munging patch.

-- >8 --
send-email: rfc2047-quote subject lines with non-ascii characters

We always use 'utf-8' as the encoding, since we currently
have no way of getting the information from the user.

This also refactors the quoting of recipient names, since
both processes can share the rfc2047 quoting code.

Signed-off-by: Jeff King <peff@xxxxxxxx>
---
 git-send-email.perl   |   19 +++++++++++++++++--
 t/t9001-send-email.sh |   15 +++++++++++++++
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 7c4f06c..3694f81 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -536,6 +536,14 @@ EOT
 		if (!$in_body && /^MIME-Version:/i) {
 			$need_8bit_cte = 0;
 		}
+		if (!$in_body && /^Subject: ?(.*)/i) {
+			my $subject = $1;
+			$_ = "Subject: " .
+				($subject =~ /[^[:ascii:]]/ ?
+				 quote_rfc2047($subject) :
+				 $subject) .
+				"\n";
+		}
 		print C2 $_;
 	}
 	close(C);
@@ -626,6 +634,14 @@ sub unquote_rfc2047 {
 	return wantarray ? ($_, $encoding) : $_;
 }
 
+sub quote_rfc2047 {
+	local $_ = shift;
+	my $encoding = shift || 'utf-8';
+	s/([^-a-zA-Z0-9!*+\/])/sprintf("=%02X", ord($1))/eg;
+	s/(.*)/=\?$encoding\?q\?$1\?=/;
+	return $_;
+}
+
 # use the simplest quoting being able to handle the recipient
 sub sanitize_address
 {
@@ -643,8 +659,7 @@ sub sanitize_address
 
 	# rfc2047 is needed if a non-ascii char is included
 	if ($recipient_name =~ /[^[:ascii:]]/) {
-		$recipient_name =~ s/([^-a-zA-Z0-9!*+\/])/sprintf("=%02X", ord($1))/eg;
-		$recipient_name =~ s/(.*)/=\?utf-8\?q\?$1\?=/;
+		$recipient_name = quote_rfc2047($recipient_name);
 	}
 
 	# double quotes are needed if specials or CTLs are included
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index e222c49..a4bcd28 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -210,4 +210,19 @@ test_expect_success '--compose respects user mime type' '
 	! grep "^Content-Type: text/plain; charset=utf-8" msgtxt1
 '
 
+test_expect_success '--compose adds MIME for utf8 subject' '
+	clean_fake_sendmail &&
+	echo y | \
+	  GIT_EDITOR=$(pwd)/fake-editor \
+	  GIT_SEND_EMAIL_NOTTY=1 \
+	  git send-email \
+	  --compose --subject utf8-sübjëct \
+	  --from="Example <nobody@xxxxxxxxxxx>" \
+	  --to=nobody@xxxxxxxxxxx \
+	  --smtp-server="$(pwd)/fake.sendmail" \
+	  $patches &&
+	grep "^fake edit" msgtxt1 &&
+	grep "^Subject: =?utf-8?q?utf8-s=C3=BCbj=C3=ABct?=" msgtxt1
+'
+
 test_done
-- 
1.5.5.rc1.123.ge5f4e6
