Header encoding/charset issue in 1.4.23-svn

NOTE:  SourceForge is currently having problems with their mailing
lists.  This is a message originally from Juergen Nickelsen sent on
Thu, 13 Nov 2014 14:13:49 +0100
=====

Hello all,

background: I run two SquirrelMail instances, one for our university
members, the other for an academic government institute. Both are
currently 1.4.22 (with a few, but different, local changes each) on
Debian "Squeeze". Impending OS upgrade to "Wheezy" implies PHP 5.4,
which AFAIK means I have to move to 1.4.23-svn.

Rebasing our local changes on the squirrelmail-20141105_0200-SVN.stable
snapshot seemed to be successful, except for one thing:

Header lines (e.g. From, Subject) that are encoded in a charset (test
case: iso-8859-1) that is not SquirrelMail's $default_charset (here:
utf-8) are not decoded correctly. The error message in the log is "PHP
Warning:  htmlspecialchars(): Invalid multibyte sequence in argument in
/home/webmail/src/squirrel/functions/strings.php on line 1512".
SquirrelMail then does not display the corresponding header contents in
the message list or the message display, but shows "(unknown)" instead.

This issue is not only present in our locally patched versions, but also
in the version in the Debian package of SquirrelMail for "Wheezy", which
is claimed to be 2:1.4.23~svn20120406-2, so apparently 2.5 years older,
as well as in a very sparingly-configured unpatched installation of
today's 1.4.23-svn snapshot.

The decisive point seems to be the value of $default_charset versus the
header encoding: the problem appears with iso-8859-1 headers only when
$default_charset is set to 'utf-8'. utf-8 headers are not affected.

I tracked the issue down to functions/i18n.php:charset_decode(), where
the charset used in the header line is not passed to
sm_encode_html_special_chars(), so htmlspecialchars() is then called
with the default encoding.
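
A minimal sketch of the failure mode described above, using plain
htmlspecialchars() rather than SquirrelMail's wrapper. The sample string
and variable names are made up for illustration, and 'UTF-8' is passed
explicitly to stand in for the default encoding. With PHP 5.4 or later,
input that is not valid in the given encoding yields an empty string
unless ENT_IGNORE or ENT_SUBSTITUTE is set, which would be consistent
with the empty header contents reported here.

```php
<?php
// Hypothetical sample: "Jürgen" as ISO-8859-1 bytes, plus some markup.
// The lone byte 0xFC is not a valid UTF-8 sequence.
$latin1 = "J\xFCrgen <test>";

// Interpreted as UTF-8, the input is invalid, so an empty string is returned.
$asUtf8 = htmlspecialchars($latin1, ENT_COMPAT | ENT_HTML401, 'UTF-8');

// Interpreted as ISO-8859-1, every byte is valid; only < and > are escaped.
$asLatin1 = htmlspecialchars($latin1, ENT_COMPAT | ENT_HTML401, 'ISO-8859-1');

var_dump($asUtf8 === '');
var_dump($asLatin1 === "J\xFCrgen &lt;test&gt;");
```

Passing the charset of the string being escaped, as in the second call,
is the same idea as handing $charset to the escaping function.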

This patch seems to fix the problem:

diff --git a/functions/i18n.php b/functions/i18n.php
index ec19c25..edbc6d6 100644
--- a/functions/i18n.php
+++ b/functions/i18n.php
@@ -184,7 +184,7 @@ function charset_decode ($charset, $string, $force_decode=false, $save_html=fals
     }

     /* All HTML special characters are 7 bit and can be replaced first */
-    if (! $save_html) $string = sm_encode_html_special_chars ($string);
+    if (! $save_html) $string = sm_encode_html_special_chars ($string, ENT_COMPAT | ENT_HTML401, $charset);
     $charset = strtolower($charset);

     set_my_charset();

I have attached a message that I have used to reproduce the problem. The
From and Subject header lines are encoded in iso-8859-1, while
SquirrelMail's configured charset is utf-8. (I have also changed the
language charset for the languages we use (de_DE, en_US) to utf-8.
Apparently this is irrelevant, though.)
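
For readers without the attachment, a comparable test header can be
built by hand. This is only a sketch under assumptions: the name, the
address, and the choice of the RFC 2047 "B" (base64) encoding are
illustrative and not taken from the attached message.

```php
<?php
// Hypothetical sender name as raw ISO-8859-1 bytes (0xFC is "ü").
$name = "J\xFCrgen";

// RFC 2047 encoded-word layout: =?charset?encoding?encoded-text?=
// "B" means the encoded text is base64.
$encodedWord = '=?iso-8859-1?B?' . base64_encode($name) . '?=';

// Made-up address under a reserved example domain.
$header = 'From: ' . $encodedWord . ' <user@example.org>';

echo $header, "\n";
```

Placing such a header in a message delivered to a test mailbox should
exercise the same decoding path whenever the configured charset differs
from iso-8859-1.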

Perhaps I am "doing it wrong", but we need UTF-8, as the university
members and their correspondents come in all shapes, sizes, colors, and
languages.
-----
squirrelmail-users mailing list
Posting guidelines: http://squirrelmail.org/postingguidelines
List address: squirrelmail-users@xxxxxxxxxxxxxxxxxxxxx
List archives: http://news.gmane.org/gmane.mail.squirrelmail.user
List info (subscribe/unsubscribe/change options): https://lists.sourceforge.net/lists/listinfo/squirrelmail-users