Search Postgresql Archives

Re: client_encoding issue with SQL_ASCII on 8.3 to 10 upgrade

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On Mon, Apr 16, 2018 at 12:09 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Keith Fiske <keith.fiske@xxxxxxxxxxxxxxx> writes:
> Running into an issue with helping a client upgrade from 8.3 to 10 (yes, I
> know, please keep the out of support comments to a minimum, thanks :).

> The old database was in SQL_ASCII and it needs to stay that way for now
> unfortunately. The dump and restore itself works fine, but we're now
> running into issues with some data returning encoding errors unless we
> specifically set the client_encoding value to SQL_ASCII.

I'm guessing you might be hitting this 9.1 change:

    * Have psql set the client encoding from the operating system locale
      by default (Heikki Linnakangas)

      This only happens if the PGCLIENTENCODING environment variable is
      not set.

I think the previous default was to set client encoding equal to the
server encoding.

> Looking at the 8.3 database, it has the client_encoding value set to UTF8
> and queries seem to work fine. Is this just a bug in the old 8.3 not
> enforcing encoding properly?

Somewhere along the line we made SQL_ASCII -> something else conversions
check that the data was valid per the other encoding, even though no
actual data change happens.

> The other thing I noticed on the 10 instance was that, while the LOCALE was
> set to SQL_ASCII,

You mean encoding, I assume.

> the COLLATE and CTYPE values for the restored databases
> were en_US.UTF-8. Could this be having an affect?

This is not a great idea, no.  You could be getting strange misbehaviors
in e.g. string comparison, because strcoll() will expect UTF8 data and
will likely not cope well with data that isn't valid in that encoding.

If you can't sanitize the encoding of your data, I'd suggest running
with lc_collate and lc_ctype set to "C".

                        regards, tom lane


Thanks to both of you Adrian & Tom.

It is the 9.1 change to the psql client that seems to be causing this. 

And pg_controldata was able to show that the CTYPE and COLLATE were UTF8 on the old system. If that's the case, do you still think it's a good idea to set the COLLATE and CTYPE to "C"?


--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux