Search Postgresql Archives

Re: client_encoding issue with SQL_ASCII on 8.3 to 10 upgrade

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 04/16/2018 08:16 AM, Keith Fiske wrote:
Running into an issue with helping a client upgrade from 8.3 to 10 (yes, I know, please keep the out of support comments to a minimum, thanks :).

The old database was in SQL_ASCII and it needs to stay that way for now unfortunately. The dump and restore itself works fine, but we're now running into issues with some data returning encoding errors unless we specifically set the client_encoding value to SQL_ASCII.

Looking at the 8.3 database, it has the client_encoding value set to UTF8 and queries seem to work fine. Is this just a bug in the old 8.3 not enforcing encoding properly?e

AFAIK, SQL_ASCII basically means no encoding:

https://www.postgresql.org/docs/10/static/multibyte.html

"The SQL_ASCII setting behaves considerably differently from the other settings. When the server character set is SQL_ASCII, the server interprets byte values 0-127 according to the ASCII standard, while byte values 128-255 are taken as uninterpreted characters. No encoding conversion will be done when the setting is SQL_ASCII. Thus, this setting is not so much a declaration that a specific encoding is in use, as a declaration of ignorance about the encoding. In most cases, if you are working with any non-ASCII data, it is unwise to use the SQL_ASCII setting because PostgreSQL will be unable to help you by converting or validating non-ASCII characters."


What client are you working with?

If psql then its behavior has changed between 8.3 and 10:

https://www.postgresql.org/docs/10/static/release-9-1.html#id-1.11.6.121.3

"

Have psql set the client encoding from the operating system locale by default (Heikki Linnakangas)

This only happens if the PGCLIENTENCODING environment variable is not set.
"

https://www.postgresql.org/docs/10/static/app-psql.html

"If both standard input and standard output are a terminal, then psql sets the client encoding to “auto”, which will detect the appropriate client encoding from the locale settings (LC_CTYPE environment variable on Unix systems). If this doesn't work out as expected, the client encoding can be overridden using the environment variable PGCLIENTENCODING."



The other thing I noticed on the 10 instance was that, while the LOCALE was set to SQL_ASCII, the COLLATE and CTYPE values for the restored databases were en_US.UTF-8. Could this be having an affect? Is there any way to see what these values were on the old 8.3 database? The pg_database catalog does not have these values stored back then.

--
Keith Fiske
Senior Database Engineer
Crunchy Data - http://crunchydata.com


--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux