XML, control characters and MHonArc

I've recently been looking at revamping an archive and having MHonArc
output XML, which is then pulled into a PHP-based application using

Mostly this is working fine, but I have the occasional problem with
control characters in badly formatted emails. Specifically, given a
quoted-printable (QP) email containing the string =12, MHonArc decodes
it and writes the resulting control character (0x12) into the XML.
Such characters are not valid in XML 1.0, and the XML parser chokes on
them.
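
To illustrate the failure described above, here is a minimal sketch in
Python (not MHonArc's Perl, and not the PHP application itself). It
shows "=12" decoding to the byte 0x12, a parser rejecting it, and a
possible workaround of stripping such characters before parsing. The
names strip_illegal_xml_chars and ILLEGAL_XML_CHARS are hypothetical,
and the <body> element is made up for the example:

```python
import quopri
import re
import xml.etree.ElementTree as ET

# C0 control characters that XML 1.0 forbids: everything below 0x20
# except tab (0x09), LF (0x0A) and CR (0x0D). Surrogates and
# U+FFFE/U+FFFF are also forbidden but are not handled in this sketch.
ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_illegal_xml_chars(text: str) -> str:
    """Remove control characters that XML 1.0 does not allow."""
    return ILLEGAL_XML_CHARS.sub("", text)


# "=12" in a quoted-printable body decodes to the single byte 0x12.
decoded = quopri.decodestring(b"bad=12char").decode("ascii")

# Hypothetical XML fragment containing the raw control character.
xml_text = "<body>" + decoded + "</body>"

# Parsing the raw text fails because 0x12 is not a legal XML character.
try:
    ET.fromstring(xml_text)
    parsed_raw = True
except ET.ParseError:
    parsed_raw = False

# After stripping, the same fragment parses.
clean = ET.fromstring(strip_illegal_xml_chars(xml_text))
```

This only cleans up on the consuming side; it does not stop MHonArc
from writing the characters in the first place, which is what I would
really prefer.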

I see a quick mention of a similar problem back in 2000:

Have things changed? Is there any way, short of writing a custom filter
or hacking/patching an existing one, to persuade MHonArc to strip out
control characters that are illegal in XML?

If not, any hints on where to start hacking?


Chris Hastie

