Handling illegal byte sequences in UTF-8 strings

Hello list,

We recently upgraded our database to PostgreSQL 8.1.x, which handles
UTF-8 more strictly than previous versions. The new version rejects
any insert whose data contains illegal byte sequences.

This has caused errors in the system we use to input data.
Basically, the system inserts data which is copy-pasted from
OpenOffice.org files. The content of those files is in turn pasted
from various websites which may or may not be using UTF-8 encoding.

After some research, I have looked at both iconv and mbstring (I might
use iconv since it's there by default). Still, someone on the list
may know a better way of handling this issue.

What then would be the best way to handle illegal byte sequences
before they are inserted into the database?
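
To make the question concrete, here is a rough sketch of the mbstring
route. The helper name clean_utf8() and the ISO-8859-1 fallback are
assumptions made for illustration only: the real source encoding of the
pasted text is unknown, so the fallback is a guess, not a fix.

```php
<?php
// Hypothetical helper (not part of PHP): returns a string that is
// guaranteed to be well-formed UTF-8 before it goes to the database.
function clean_utf8($input, $fallback = 'ISO-8859-1')
{
    // mb_check_encoding() returns true only for well-formed UTF-8.
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input;
    }

    // Assumption: anything that is not valid UTF-8 was really written
    // in $fallback. Every byte is a valid ISO-8859-1 character, so this
    // conversion always yields valid UTF-8, though possibly the wrong
    // characters if the guess about the source encoding is off.
    return mb_convert_encoding($input, 'UTF-8', $fallback);
}

// Usage: "caf\xE9" is "café" in ISO-8859-1 and is not valid UTF-8.
$safe = clean_utf8("caf\xE9");
```

With iconv the idea would be the same: validate first, then convert
from whichever encoding the input is assumed to be in.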


--
Stand before it and there is no beginning.
Follow it and there is no end.
Stay with the ancient Tao,
Move with the present.

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


