On 8/15/07, Phoenix Kiula <phoenix.kiula@xxxxxxxxx> wrote:
> On 15/08/07, Ivan Zolotukhin <ivan.zolotukhin@xxxxxxxxx> wrote:
> > Hello,
> >
> > Actually I tried something like $str = @iconv("UTF-8", "UTF-8//IGNORE",
> > $str); when preparing string for SQL query and it worked. There's
> > probably a better way in PHP to achieve this: simply change default
> > values in php.ini for these parameters:
> >
> > mbstring.encoding_translation = On
> > mbstring.substitute_character = none
> >
> > and broken symbols will be automatically stripped off from the input
> > and output.
>
>
> Sadly, they don't always do that, not with Asian scripts.
>
> And I do not completely agree, like the other poster suggested, with
> the concept of GIGO. Sometimes you want the end-user's experience to
> be seamless. For example, in one of our web sites, we allow users to
> submit text through a bookmarklet, where the title of the webpage
> comes in rawurlencoded format. We try to rawurldecode() it on our end
> but most of the time the Asian interpretation is wrong. We have all
> the usual mbstring settings in php.ini. In this scenario, the user did
> not enter any garbage. Our application should have the ability to
> recognize the text. We do what we can with mb_convert...etc, but the
> database just throws an error.
>
> PGSQL really needs to get with the program when it comes to utf-8 input.

What, exactly, does that mean?

That PostgreSQL should take things in invalid utf-8 format and just store them?
Or that PostgreSQL should autoconvert from invalid utf-8 to valid
utf-8, guessing the proper codes?

Seriously, what do you want pgsql to do with these invalid inputs?
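
To make the question concrete, here is a minimal sketch of the choices available when bytes that claim to be utf-8 are not. It is written in Python rather than PHP purely for illustration, and it is not what PostgreSQL does internally. The function names decode_claimed_utf8 and guess_decode, and the list of candidate encodings, are hypothetical and chosen only for this example.

```python
def decode_claimed_utf8(raw: bytes, policy: str = "reject") -> str:
    """Decode bytes that are claimed to be UTF-8 under one explicit policy."""
    if policy == "reject":
        # Strict decoding: raises UnicodeDecodeError on any invalid sequence.
        return raw.decode("utf-8")
    if policy == "strip":
        # Silently drop invalid bytes, similar in spirit to iconv's //IGNORE.
        return raw.decode("utf-8", errors="ignore")
    if policy == "replace":
        # Substitute U+FFFD for each invalid sequence so the damage stays visible.
        return raw.decode("utf-8", errors="replace")
    raise ValueError(f"unknown policy: {policy!r}")


def guess_decode(raw: bytes, candidates=("utf-8", "shift_jis", "euc_jp")) -> tuple:
    """Try each candidate encoding in order and return (text, encoding).

    This is guesswork: the first encoding that decodes without error wins,
    even if it is not the encoding the sender actually used.
    """
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the input")
```

The "strip" and "replace" policies lose data, and guess_decode can silently pick the wrong encoding, since some byte strings are valid in more than one of them. That is why this kind of decision arguably belongs in the application, which may know where the text came from, rather than in the database.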