Rick Schumeyer wrote:
I will have to try the WIN1252 encoding.
On the client side, my application is a web browser. On the server
side, it is php scripts on a linux box. The data comes from copying
data from a browser window (pointing to another web site) and pasting it
into an html textarea, which is then submitted.
Given this, would you still suggest the WIN1252 encoding?
In my setup I compiled PHP with
--enable-zend-multibyte
...which, as far as I understand, makes all strings Unicode internally
(I suppose they use wchar_t instead of char or something). Thus the
mb_*() functions are [from what I can tell] not necessary [for me]
anymore. Do use a fairly recent PHP, and not only for the bind
variables in the pg API.
In php.ini I've got
default_charset = "utf-8"
mbstring.internal_encoding = UTF-8
and in the HTML head:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
The db is in utf-8.
It has flawlessly saved everything I've tossed at it, including all
sorts of apostrophes. I've copied and pasted Chinese, Hebrew, Swedish,
Arabic... texts into a <textarea> with no problem other than that Hebrew
and Arabic make most sense written from right to left ;-)
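
For illustration, the server side of such a setup could look roughly
like the sketch below. It is not my actual code: the helper names, the
table "notes" and its column "body" are made up, and it assumes the
mbstring and pgsql extensions are loaded. The first helper rejects
input that is not valid UTF-8 (for example a stray WIN1252 apostrophe
byte), the second tells PostgreSQL the client speaks UTF-8 and inserts
the text through a bind variable.

```php
<?php
// Hypothetical helper: returns the submitted text unchanged if it is
// valid UTF-8, or null otherwise. Requires the mbstring extension.
function clean_utf8_input($raw)
{
    return mb_check_encoding($raw, 'UTF-8') ? $raw : null;
}

// Hypothetical helper: stores the text using a bind variable ($1).
// The table "notes" and column "body" are assumptions for this sketch.
// $conn is a connection returned by pg_connect().
function save_note($conn, $text)
{
    // Declare that everything sent on this connection is UTF-8.
    pg_set_client_encoding($conn, 'UTF8');

    $result = pg_query_params(
        $conn,
        'INSERT INTO notes (body) VALUES ($1)',
        array($text)
    );

    return $result !== false;
}
```

In the script that handles the form you would pass the textarea value
through clean_utf8_input() first and only call save_note() when the
result is not null.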
Best regards,
Marcus