On 07/22/2012 03:58 PM, Herouth Maoz wrote:
As far as I know, PostgreSQL's encoding handling functions do not offer substitution for unsupported characters, nor does the built-in client<->server charset translation feature.

You could do it with a regular expression replacement of any character not in a class that contains every character valid in the target encoding. It feels like a very clunky approach, though.

An alternative is to use a procedural language that DOES support lossy character encoding conversions. I don't think plpython does, and plpgsql certainly doesn't if PostgreSQL itself doesn't. I'd be amazed if plperl didn't support lossy conversions, but I haven't done much with Perl in years.

It'd be handy if Pg's client<->server conversion supported lossy conversions for this kind of thing. Honestly, though, I'm not sad that it doesn't, because people would misuse it to make error messages they didn't understand go away, then come back later and complain that PostgreSQL ate their data.

-- Craig Ringer
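
For illustration only, here is a minimal sketch of what a "lossy" conversion means, written in plain Python outside the database. It relies on the standard `errors="replace"` handler of `str.encode`, which substitutes `?` for any character the target encoding cannot represent. The function name `lossy_convert` and the `latin-1` target are assumptions made for the example; whether the same call is usable from inside plpython is not something this message confirms.

```python
def lossy_convert(text: str, encoding: str = "latin-1") -> str:
    """Return `text` with every character that `encoding` cannot
    represent replaced by "?" (hypothetical helper for illustration)."""
    # errors="replace" makes the encoder emit "?" instead of raising
    # UnicodeEncodeError for unencodable characters.
    encoded = text.encode(encoding, errors="replace")
    # Decode back so the caller gets a str containing only characters
    # that are valid in the target encoding.
    return encoded.decode(encoding)


if __name__ == "__main__":
    # U+20AC (euro sign) is not in latin-1, so it is substituted;
    # U+00EF (i with diaeresis) is in latin-1, so it survives.
    print(lossy_convert("na\u00efve \u20ac5"))
```

The substitution is one-way: once a character has been replaced, the original cannot be recovered, which is exactly the data-loss risk described above.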