
Re: Why is an ISO-8859-8 database allowing values not within that set?

On 07/22/2012 03:58 PM, Herouth Maoz wrote:
> Thanks. That makes sense. The default client encoding on the reports database is ISO-8859-8, so I guess when I don't set it using \encoding, it does exactly what you say.
>
> OK, so I'm still looking for a way to convert illegal characters into something that won't collide with my encoding (asterisks or whatever).


As far as I know, PostgreSQL's encoding handling functions do not offer substitution for unsupported characters, nor does the built-in client<->server charset translation feature. You could do it with a regular expression replacement of any character not in a class that contains every character valid in the target encoding. It feels like a very clunky approach though.
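To make the regular expression idea concrete, here is a rough client-side sketch in Python rather than in the database. It assumes the target is ISO-8859-8 as in this thread; `valid_chars` and `scrub` are hypothetical helper names. The character class is built by decoding each of the 256 possible byte values and keeping the ones the codec accepts.

```python
import re

TARGET = "iso-8859-8"  # assumption: the encoding discussed in this thread


def valid_chars(encoding):
    """Return every character a single-byte encoding can represent."""
    chars = []
    for byte in range(256):
        try:
            chars.append(bytes([byte]).decode(encoding))
        except UnicodeDecodeError:
            # This byte value is unassigned in the encoding; skip it.
            pass
    return chars


# Negated character class: matches any character NOT valid in TARGET.
INVALID = re.compile(
    "[^" + "".join(re.escape(c) for c in valid_chars(TARGET)) + "]"
)


def scrub(text, replacement="*"):
    """Replace each character that TARGET cannot represent."""
    return INVALID.sub(replacement, text)
```

Calling `scrub()` on each value before sending it to the server should leave only characters the conversion accepts. The same clunkiness applies: the class has to be rebuilt for every target encoding, and this only works for single-byte encodings.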

An alternative is to use a procedural language that DOES support lossy character encoding conversions. I don't think plpython does and plpgsql certainly doesn't if PostgreSQL itself doesn't. I'd be amazed if plperl didn't support lossy conversions, but I haven't done much with Perl in years.
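For comparison, this is roughly what a lossy conversion looks like in plain Python run on the client side. Python's `str.encode` takes an `errors` argument, and `codecs.register_error` lets you plug in your own substitution. The handler name `"asterisk"` and the function `to_lossy` are made up for this sketch, and I have not checked how well this carries over to PL/Python inside the server.

```python
import codecs


def _asterisk_handler(err):
    """Substitute one '*' for each character that cannot be encoded."""
    if isinstance(err, UnicodeEncodeError):
        # err.start:err.end is the run of unencodable characters.
        return ("*" * (err.end - err.start), err.end)
    raise err


# "asterisk" is a hypothetical handler name chosen for this example.
codecs.register_error("asterisk", _asterisk_handler)


def to_lossy(text, encoding="iso-8859-8"):
    """Encode text, turning unsupported characters into asterisks."""
    return text.encode(encoding, errors="asterisk")
```

For example, `to_lossy("a€b")` yields bytes that are safe to send with client_encoding set to ISO-8859-8. The built-in `errors="replace"` does the same thing with `?` instead of `*`. Either way the original characters are gone for good, so keep the source data somewhere.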

It'd be handy if Pg's client<->server conversion supported lossy conversions for this kind of thing. Honestly I'm not sad it doesn't, because it'd be something people would misuse to make the error messages they didn't understand go away - then come back and complain that PostgreSQL ate their data later.

--
Craig Ringer
