Re: Converting ISO 8859-2 characters

"Peter Seitz" <seitz@bzs.tu-graz.ac.at> · Sun, 21 Jan 2001 01:03:10 +0100

On Fri, 19 Jan 2001 19:02:53 -0800
   Earl Hood <ehood@hydra.acs.uci.edu> wrote:

> On January 19, 2001 at 22:30, "Peter Seitz" wrote:
>
> > <CharsetConverters>
> > plain;          mhonarc::htmlize;
> > us-ascii;       mhonarc::htmlize;
> > iso-8859-1;     mhonarc::htmlize;
> > iso-8859-2;     iso_8859::str2sgml;     iso8859-ref.pl
> > iso-8859-3;     iso_8859::str2sgml;     iso8859.pl
> > iso-8859-4;     iso_8859::str2sgml;     iso8859.pl
> > iso-8859-5;     iso_8859::str2sgml;     iso8859.pl
> > iso-8859-6;     iso_8859::str2sgml;     iso8859.pl
> > iso-8859-7;     iso_8859::str2sgml;     iso8859.pl
> > iso-8859-8;     iso_8859::str2sgml;     iso8859.pl
> > iso-8859-9;     iso_8859::str2sgml;     iso8859.pl
> > iso-8859-10;    iso_8859::str2sgml;     iso8859.pl
> > default;        -ignore-
> > </CharsetConverters>
> >
> > But the result is not what I expected. The characters are still
> > converted using the entities. I am stuck here.
>
> Use a different package/function name.  What is happening is that
> whichever library is read last, iso8859.pl or iso8859-ref.pl,
> overrides the other library's function definition.
>
> Change iso8859-ref.pl to use a package name of "iso_8859_ref",
> and register it as:
>
> iso-8859-2;     iso_8859_ref::str2sgml;     iso8859-ref.pl
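
To check that I understood the override correctly, here is a minimal
sketch. The one-line str2sgml bodies are stand-ins I made up, not the
real library code, and I put everything into one file, whereas the real
libraries are loaded separately:

```perl
use strict;
use warnings;
no warnings 'redefine';   # the second definition below is intentional

# Stand-in for iso8859.pl (named entities).
package iso_8859;
sub str2sgml { return 'entities' }

# Stand-in for an unmodified iso8859-ref.pl that reuses the SAME package
# name: this later definition replaces the one above.
package iso_8859;
sub str2sgml { return 'numeric references' }

# Stand-in for iso8859-ref.pl after renaming its package: no clash.
package iso_8859_ref;
sub str2sgml { return 'numeric references' }

package main;
print iso_8859::str2sgml(), "\n";      # prints: numeric references
print iso_8859_ref::str2sgml(), "\n";  # prints: numeric references
```

With two definitions under one package name only the later one
survives, so the renamed package is what keeps both converters usable.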

I did as you suggested, but I still have problems. I don't know whether
it's a bug or a feature.

With the CharsetConverters set up as you describe above, only the mail
headers are converted correctly to character references:

To: "=?ISO-8859-2?Q?nov=FD =E8len?=" <test@fbzslinux.tu-graz.ac.at>

gets converted to:

<LI><em>To</em>: "nov&#253; &#269;len" &lt;<A HREF="mailto:test@fbzslinux.tu%2Dgraz.ac.at">test@fbzslinux.tu-graz.ac.at</A>&gt;</LI>
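
This is the conversion I would like for the body as well. A minimal
sketch of it in Perl, assuming the Encode module is available; the
function name qword_to_refs is my own and is not part of MHonArc:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Decode one RFC 2047 "Q" encoded word and replace every character
# outside US-ASCII with a numeric character reference.
sub qword_to_refs {
    my ($word) = @_;
    my ($charset, $text) = $word =~ /^=\?([^?]+)\?[Qq]\?([^?]*)\?=$/
        or return $word;                       # not an encoded word
    $text =~ tr/_/ /;                          # "_" means space in Q encoding
    $text =~ s/=([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # =FD -> byte 0xFD
    my $chars = decode($charset, $text);       # bytes -> Unicode characters
    $chars =~ s/([^\x00-\x7F])/'&#' . ord($1) . ';'/ge;
    return $chars;
}

print qword_to_refs('=?ISO-8859-2?Q?nov=FD =E8len?='), "\n";
# prints: nov&#253; &#269;len
```

In ISO 8859-2 the byte 0xFD is U+00FD (253) and 0xE8 is U+010D (269),
which matches the references in the header above.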

But the message body still uses the entities defined in the iso8859.pl
file.

I've fiddled around a little with the Perl files from the
distribution. Changing the converter reference in the mhinit.pl
file works as I expected:

##  Charset filters
##
%readmail::MIMECharSetConverters = (
    # Character set         Converter Function
    #-------------------------------------------------------------------
    "plain",                "mhonarc::htmlize",
    "us-ascii",             "mhonarc::htmlize",
    "iso-8859-1",           "mhonarc::htmlize",
#    "iso-8859-2",          "iso_8859::str2sgml",
    "iso-8859-2",           "iso_8859_ref::str2sgml",
    "iso-8859-3",           "iso_8859::str2sgml",
# [...]
    "default",              "-ignore-",
);
%readmail::MIMECharSetConvertersSrc = (
    # Character set         Converter Function
    #-------------------------------------------------------------------
    "plain",                undef,
    "us-ascii",             undef,
    "iso-8859-1",           undef,
#    "iso-8859-2",          "iso8859.pl",
    "iso-8859-2",           "iso8859-ref.pl",
# [...]
    "default",              undef,
);

With this setting, the mail body is also converted correctly.

This leads me to assume that the definitions from the resource file
are not considered when converting the message body.
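
What I expected to happen is roughly the following. This is only a
sketch of my assumption, not MHonArc's actual code: the helper
apply_charset_converters and the shortened tables are made up for
illustration.

```perl
use strict;
use warnings;

# Built-in defaults, shaped like the tables in mhinit.pl (shortened).
my %converters     = ('iso-8859-2' => 'iso_8859::str2sgml',
                      'default'    => '-ignore-');
my %converters_src = ('iso-8859-2' => 'iso8859.pl',
                      'default'    => undef);

# Hypothetical helper: apply the lines of a <CharsetConverters> element,
# each of the form "charset; function; source file".
sub apply_charset_converters {
    my ($conv, $src, @lines) = @_;
    for my $line (@lines) {
        my @fields = split /;/, $line;
        s/^\s+|\s+$//g for @fields;            # trim each field
        my ($charset, $func, $file) = @fields;
        next unless defined $charset && length $charset;
        next unless defined $func    && length $func;
        $charset = lc $charset;
        $conv->{$charset} = $func;
        $src->{$charset}  = (defined $file && length $file) ? $file : undef;
    }
    return;
}

apply_charset_converters(\%converters, \%converters_src,
    'iso-8859-2;     iso_8859_ref::str2sgml;     iso8859-ref.pl');

print "$converters{'iso-8859-2'} from $converters_src{'iso-8859-2'}\n";
# prints: iso_8859_ref::str2sgml from iso8859-ref.pl
```

If the resource file were applied like this before the body is
converted, I would expect the same result as with my change to
mhinit.pl.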

I was not able to find out where the settings from the resource file
are used during conversion, because I am still learning Perl. So I guess
I have to wait for Earl to comment on this issue.

Thanks in advance

With best compliments

           Peter Seitz
--

  Graz University of Technology, Austria - Fac. f. Civil Engineering
  mailto:seitz@bzs.tu-graz.ac.at - http://wwwbzs.tu-graz.ac.at/~seitz/

            Member of the Pegasus Mail Support Group
          Coordinator of the Pmail Translation Process

For information about translating Pegasus Mail, contact:
Han van den Bogaerde or Peter Seitz at
translation-coordinator@pmail.gen.nz