Re: PHP + PostgreSQL: invalid byte sequence for encoding "UTF8"

> Please configure your email client so we don't receive 5 copies of your
> mail.

I just fixed that; it won't happen again.

> This indicates that PHP is not using UTF-8.  That output is typical of
> UTF-8 output as Latin characters.

Well, maybe the output is not correct: when I run the PHP script on the console (CLI), the content is shown in the wrong charset. But that's not the problem; calling utf8_decode() lets me output it in the right charset.

> Not true, it only indicates that phpPgAdmin is configured to handle
> UTF-8 correctly.

Well, I searched all the source code of phpPgAdmin for charsets and I found:

    echo "\t<meta http-equiv=\"Content-Type\" content=\"text/html; charset={$data->codemap[$dbEncoding]}\" />\r\n";

So this means phpPgAdmin sets the output charset to the charset used by the database it is connected to. But that's still not the problem, because I also know how to fix charset output in browsers.

> Once again indicating your data needs to be converted from some other
> character set.

It's already converted to be UTF-8 compatible when I fetch it from the other resources.

> I had similar problems getting PHP to work with UTF-8 and MySQL.  Many
> of PHP's function are not multibyte aware and assume a Latin character set.
> What, if any, output buffering are you using? What is your
> default_charset set to?

Well, I've set default_charset to UTF8; before, it was "" (empty). But the console (CLI) output and the problem are still the same after this change, so this is not the problem. I also don't need proper console output without utf8_decode(): if I want proper output there I just decode, as I do when I want it displayed properly in the browser.

Maybe a clearer explanation of the problem:

I fetch something from the database, which looks like "lacarrière" when I output it in PHP (let's not get confused by PHP's output). Then I fetch something from another resource, which looks like "lacarrière". When I compare both strings in PHP, it tells me that they are "not equal".

So I HAVE TO call either utf8_encode() on the string from the other resource OR utf8_decode() on the string from the database for them to compare as "equal".

...and THIS means a lot more code in my classes.

Hint: The other resource is a socket connection (API) to another server.
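
A minimal sketch of why the comparison fails, assuming the socket delivers ISO-8859-1 (Latin-1) bytes while the database delivers UTF-8. The variable names are made up, and the strings are written with byte escapes so the result does not depend on how the script file itself is saved. iconv() is used here in place of utf8_encode(); for Latin-1 input both produce the same bytes.

```php
<?php
// Hypothetical values: the same word as UTF-8 bytes and as Latin-1 bytes.
$fromDb     = "lacarri\xC3\xA8re"; // "è" stored as two bytes, C3 A8 (UTF-8)
$fromSocket = "lacarri\xE8re";     // "è" stored as one byte, E8 (ISO-8859-1)

// "==" on two non-numeric strings compares their bytes, so these differ.
var_dump($fromDb == $fromSocket);

// The hex dumps show where the two byte sequences diverge.
echo bin2hex($fromDb), "\n";
echo bin2hex($fromSocket), "\n";

// Convert one side so both use the same encoding, then compare again.
$socketAsUtf8 = iconv('ISO-8859-1', 'UTF-8', $fromSocket);
var_dump($socketAsUtf8 === $fromDb);
```

Doing the conversion once, at the point where the socket data enters the application, would keep the encode/decode calls out of the rest of the classes.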

The problem is quite simple, I think: everything coming from the database is UTF-8 encoded and needs to be UTF-8-decoded before you can work with it properly.

default_charset seems to affect only the output buffer, so the only fix on the PHP side would be a mechanism that tells PHP to treat all strings as UTF-8 encoded, which would cost a lot more resources. I understand that this is not a solution.

So the only solutions could be:

a) Encode and decode UTF-8 data properly, and keep track of whether content is UTF-8 encoded, so that it gets decoded before it is used with other strings.

b) A mechanism to tell the pg functions in PHP to decode all data that is UTF-8 encoded. The ADODB layer seems to do that properly, but as far as I can see the pg functions don't.
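
On option b): PostgreSQL converts between the database encoding and the connection's client_encoding, and PHP exposes that setting through pg_set_client_encoding(). A rough, untested sketch, assuming the rest of the application works in Latin-1. The function name fetchTestValue and the connection string in the usage comment are made up for illustration.

```php
<?php
// Hypothetical helper: round-trip a Latin-1 string through PostgreSQL
// with the given client encoding set on the connection.
function fetchTestValue($conn, string $clientEncoding): string
{
    // pg_set_client_encoding() returns 0 on success and -1 on failure.
    if (pg_set_client_encoding($conn, $clientEncoding) !== 0) {
        throw new RuntimeException("Could not set client_encoding to $clientEncoding");
    }

    // The parameter is sent as Latin-1 bytes; with client_encoding set to
    // LATIN1 the server converts it on the way in and on the way out.
    $result = pg_query_params($conn, 'SELECT $1::text AS test', ["lacarri\xE8re"]);
    if ($result === false) {
        throw new RuntimeException(pg_last_error($conn));
    }

    return (string) pg_fetch_result($result, 0, 'test');
}

// Usage (needs a reachable server; the connection string is an assumption):
// $conn = pg_connect('dbname=test');
// var_dump(fetchTestValue($conn, 'LATIN1') === "lacarri\xE8re");
```

The alternative design is to leave client_encoding at UTF8 and convert the socket data instead, so that everything inside PHP is UTF-8.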

You can use this to reproduce it:

1. Create a table in postgres, in a UTF8-initialized database, and insert something like "lacarrière" into it.

2. Check the normal output with psql; you should get either "lacarrière" or "lacarrière", so you can be sure it's inserted correctly.

3. Write a script which fetches the string from the database into $dbString.

4. Set a string $phpString = "lacarrière";

5. Compare both strings with "==" - you'll get "false"

Another hint:

Try to send "select 'lacarrière' as test;" with pg_query to any postgres database and you'll get an error. If not... well, then I'm wrong and I've set up PHP wrong to handle UTF-8 data.

If you send "select '".utf8_encode("lacarrière")."' as test;" to your database, this should work.

Also, the above means $phpString is NOT EQUAL to the result you get from "select '".utf8_encode("lacarrière")."' as test;"; you need to compare it to utf8_decode($dbString) for them to be EQUAL.
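
If the script file is saved as Latin-1, the error in the first hint would come from the byte 0xE8: in UTF-8 it announces a three-byte sequence, but the following bytes "re" are not continuation bytes. A hedged sketch of checking a string before it goes into a query, assuming anything that is not valid UTF-8 is ISO-8859-1 (that assumption will not hold for every source). The helper name ensureUtf8 is made up.

```php
<?php
// Hypothetical helper: return $s as UTF-8.
// Assumption: input that is not valid UTF-8 is ISO-8859-1.
function ensureUtf8(string $s): string
{
    // With the /u modifier, preg_match() fails on subjects that are not
    // valid UTF-8, so a return value of 1 means $s can be used as-is.
    if (preg_match('//u', $s) === 1) {
        return $s;
    }
    return iconv('ISO-8859-1', 'UTF-8', $s);
}

$latin1 = "lacarri\xE8re";   // Latin-1 bytes, as typed in a Latin-1 file
$utf8   = ensureUtf8($latin1);
echo bin2hex($utf8), "\n";   // hex dump of the converted string
```

Calling the helper on a string that is already UTF-8 returns it unchanged, so it should be safe to apply to every value before it is sent to the database.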

-- 
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php