Re: Unicode, php and postgresql

Michael Glaesemann <grzm@xxxxxxxxxxxxx> · Tue, 9 Dec 2003 17:59:50 +0900

Hi Didier!

On Tuesday, December 9, 2003, at 05:34 PM, Didier Bretin wrote:

Hi,

I'm trying to install 7.4.0 + PHP to develop an application in Unicode.
Apparently I have no problem ;).

But I don't understand the PHP documentation well enough. My PostgreSQL
server is configured for Unicode, and my database is entirely in Unicode.
In my php.ini file I set no mbstring variables. When I connect to the
database, I SELECT the data and then print it to the browser with the
charset utf-8, and all the characters are displayed correctly.

My question is: is this the right way, i.e., do I really not have to
configure anything in PHP to deal with Unicode :) ?

In my (admittedly limited) experience with PHP 4, Unicode, and 
PostgreSQL, you can go a long way with the setup you describe, i.e., not 
using the multi-byte string functions. However, all I do is move info in 
and out of the database: I'm not doing any fancy-pants parsing of the 
data in PHP, and that includes data sanity checking (besides preventing 
SQL injection). I would *not* recommend doing it as I've done, though it 
does work for me. It's something I'm working on rectifying in my own 
code, and rather than have to fix it later, I'd recommend doing it 
right the first time.

The reason it works is that PHP (at least as of PHP 4) is agnostic about 
the encoding of its strings: it treats them as plain sequences of bytes. 
It just takes them from the database and hands them to your code, 
without trying to read, parse, or check them unless you explicitly do 
so in the code.
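
To make the byte-agnostic point concrete, here is a small sketch. It is 
in Python rather than PHP, purely as an illustration, and the sample 
value is made up. It shows why passing bytes straight through is 
harmless, while byte-oriented operations can go wrong on multi-byte 
text:

```python
# Bytes as they might arrive from a UTF-8 database (made-up sample value).
raw = "日本語".encode("utf-8")

# Passing the bytes through untouched preserves them exactly.
echoed = raw

# Byte-oriented operations count bytes, not characters.
byte_length = len(raw)                  # number of bytes
char_length = len(raw.decode("utf-8"))  # number of characters

# Cutting at a byte offset can split a character in the middle.
truncated = raw[:4]
try:
    truncated.decode("utf-8")
    truncated_is_valid = True
except UnicodeDecodeError:
    truncated_is_valid = False
```

As long as the code only echoes the bytes, nothing breaks; the trouble 
starts once it measures, cuts, or compares them.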

Again, I don't recommend this (though I've been doing it myself) 
because I don't believe you'll be able to do proper data checking, 
especially if you're using higher-order (i.e., non-ASCII) code points. 
For me, this means the Japanese that moves into my database is 
completely unchecked, and like I said, that's Not Good. To do proper 
checking of the Japanese, I'd need to use the mbstring (mb_*) functions.
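
As a rough idea of what such checking involves, here is a minimal 
sketch, again in Python as an illustration rather than PHP. The names 
is_valid_utf8 and check_field are hypothetical, and the length limit is 
an assumed example rule; in PHP the mbstring functions would be the 
natural place to look for equivalents.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def check_field(data: bytes, max_chars: int) -> bool:
    """Hypothetical sanity check: valid UTF-8 and at most max_chars characters."""
    if not is_valid_utf8(data):
        return False
    # Count characters, not bytes, so multi-byte text is measured fairly.
    return len(data.decode("utf-8")) <= max_chars
```

The point is only that the check has to happen on decoded characters 
before the data goes into the database, not on the raw bytes.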

I'm interested in hearing others' opinions on this as well, 
particularly if they think I'm wrong. I can always learn something!

hth

Michael Glaesemann
grzm myrealbox com