According to rfc2047, an encoded word has the following form:

   encoded-word = "=?" charset "?" encoding "?" encoded-text "?="

   charset = token

   encoding = token

   token = <Any CHAR except SPACE, CTLs, and especials>

   especials = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / "
               <"> / "/" / "[" / "]" / "?" / "." / "="

   encoded-text = <Any printable ASCII character other than "?"
                  or SPACE>

And rfc822 defines CTLs as:

   CTL = <any ASCII control           ; (  0- 37,  0.- 31.)
          character and DEL>          ; (    177,     127.)

The original code only detected rfc2047 encoded strings when the charset
was UTF-8. This patch generalizes the matching expression and breaks the
check for an rfc2047 encoded string into its own function.

There's no real functional change, since any properly rfc2047 encoded
string (the ones that weren't UTF-8) would have fallen through the
remaining 'if' statements and been returned unchanged.

Signed-off-by: Brandon Casey <drafnel@xxxxxxxxx>
---
 git-send-email.perl |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 3d6a982..e735815 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -772,6 +772,14 @@ sub quote_rfc2047 {
 	return $_;
 }
 
+sub is_rfc2047_quoted {
+	my $s = shift;
+	my $token = '[^][()<>@,;:"\/?.= \000-\037\177]+';
+	my $encoded_text = '[!->@-~]+';
+	length($s) <= 75 &&
+	$s =~ m/^(?:"[[:ascii:]]*"|=\?$token\?$token\?$encoded_text\?=)$/o;
+}
+
 # use the simplest quoting being able to handle the recipient
 sub sanitize_address
 {
@@ -783,7 +791,7 @@ sub sanitize_address
 	}
 
 	# if recipient_name is already quoted, do nothing
-	if ($recipient_name =~ /^("[[:ascii:]]*"|=\?utf-8\?q\?.*\?=)$/) {
+	if (is_rfc2047_quoted($recipient_name)) {
 		return $recipient;
 	}
 
-- 
1.6.3.1.9.g95405b
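
For anyone who wants to try the new check outside of git-send-email, here is a small standalone sketch. It is not part of the patch: the helper is copied from the diff, and the sample names are hypothetical, picked only to cover a non-UTF-8 charset, the B encoding, a plain quoted string, and a few inputs that should be rejected.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy of the helper from the patch: true for a double-quoted ASCII
# string or a single rfc2047 encoded-word of at most 75 characters.
sub is_rfc2047_quoted {
	my $s = shift;
	my $token = '[^][()<>@,;:"\/?.= \000-\037\177]+';
	my $encoded_text = '[!->@-~]+';
	length($s) <= 75 &&
	$s =~ m/^(?:"[[:ascii:]]*"|=\?$token\?$token\?$encoded_text\?=)$/o;
}

# Hypothetical sample names, for illustration only.
my @samples = (
	'=?utf-8?q?J=C3=BCrgen?=',             # UTF-8 charset, Q encoding
	'=?ISO-8859-1?Q?Andr=E9?=',            # non-UTF-8 charset
	'=?UTF-8?B?SsO8cmdlbg==?=',            # B encoding
	'"Quoted Name"',                       # plain double-quoted string
	'Plain Name',                          # no quoting at all
	'=?utf-8?q?two words?=',               # SPACE is not valid encoded-text
	'=?utf-8?q?' . ('a' x 80) . '?=',      # longer than 75 characters
);

for my $name (@samples) {
	my $verdict = is_rfc2047_quoted($name) ? 'quoted' : 'not quoted';
	print "$verdict: $name\n";
}
```

The first four samples should be reported as quoted and the last three as not quoted. The old expression in sanitize_address matched only the first and fourth.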