Re: Best practice for: ERROR: invalid byte sequence for encoding "UTF8"

On 15/08/07, Ivan Zolotukhin <ivan.zolotukhin@xxxxxxxxx> wrote:
> Hello,
>
> Actually, I tried something like $str = @iconv("UTF-8", "UTF-8//IGNORE",
> $str); when preparing the string for the SQL query, and it worked. There's
> probably a better way in PHP to achieve this: simply change the default
> values in php.ini for these parameters:
>
> mbstring.encoding_translation = On
> mbstring.substitute_character = none
>
> and broken characters will be automatically stripped from the input
> and output.


Sadly, those settings don't always do that, at least not with Asian scripts.

I also do not completely agree with the concept of GIGO (garbage in,
garbage out) that the other poster suggested. Sometimes you want the
end user's experience to be seamless. For example, on one of our web
sites, we allow users to submit text through a bookmarklet, where the
title of the web page comes in rawurlencoded format. We try to
rawurldecode() it on our end, but most of the time the Asian text is
interpreted incorrectly. We have all the usual mbstring settings in
php.ini. In this scenario, the user did not enter any garbage, so our
application should be able to recognize the text. We do what we can
with mb_convert...etc, but the database still just throws an error.
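
To make the bookmarklet case concrete, here is a rough sketch of the kind of
clean-up step I mean, run before the string ever reaches the query. It is
only an illustration: the function name to_valid_utf8() and the list of
candidate encodings are made up for this example, and the order of the
candidates is a guess, since the same bytes can be valid in more than one
legacy encoding.

```php
<?php
// Hypothetical helper (not a PHP built-in): decode a rawurlencoded title and
// return a string that is guaranteed to be well-formed UTF-8.
// The default candidate list and its order are assumptions for illustration.
function to_valid_utf8(
    string $raw,
    array $candidates = ['SJIS', 'EUC-JP', 'GB18030', 'BIG-5', 'EUC-KR']
): string {
    $text = rawurldecode($raw);

    // 1. Already well-formed UTF-8: pass it through untouched.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }

    // 2. Otherwise try each candidate legacy encoding in turn and convert
    //    from the first one in which the bytes are valid.
    foreach ($candidates as $encoding) {
        if (mb_check_encoding($text, $encoding)) {
            return mb_convert_encoding($text, 'UTF-8', $encoding);
        }
    }

    // 3. Last resort: drop the invalid bytes, which is what the
    //    mbstring.substitute_character = none setting is meant to do.
    $previous = mb_substitute_character();
    mb_substitute_character('none');
    $clean = mb_convert_encoding($text, 'UTF-8', 'UTF-8');
    mb_substitute_character($previous);

    return $clean;
}
```

Something like $title = to_valid_utf8($_GET['title']); would then be passed to
the query as usual. It does not solve the guessing problem, but at least the
database no longer sees an invalid byte sequence.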

PostgreSQL really needs to get with the program when it comes to UTF-8 input.
