Search Postgresql Archives

Re: [External] postgres 9.5 DB corruption: invalid byte sequence for encoding "UTF8"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello again Brad, and hello Tom. Thanks for writing, and Tom I'll add some extra thanks for the many google searches over the years which have been answered by your posts. :)

It seems there's been some confusion on the data flow so I'll try to explain. The original "source" I referred to was our JVM generating alert data and performing DML on the DB via the v42.0.0 driver. I do not know of any encoding setting for the driver itself.

The driver writes directly to our primary (Slony-I master) DB. Changes are then replicated by our replica (Slony-I subscriber) DBs. Each of these DBs has a "UTF8" server encoding. COPY out ops (performed by Slony-I or anything else) always work fine, and COPY in ops fail if there is data corruption.

ams@xxxxxxxx.netmgmt:/a/ams/lib/ext/database$ sum postgresql-42.0.0.jar 

12476   691

ams@xxxxxxxx.netmgmt:/a/ams/lib/ext/database$ 

ams@xxxxxxxx.netmgmt:~$ echo ---- Slony-I Master ----

---- Slony-I Master ----

ams@xxxxxxxx.netmgmt:~$ psql -U akamai -d ams -c "SELECT datname, pg_encoding_to_char(encoding) FROM pg_database WHERE datname = 'ams'"

 datname | pg_encoding_to_char 

---------+---------------------

 ams     | UTF8

(1 row)


ams@xxxxxxxx.netmgmt:~$ 

root@xxxxxxxxxxxxx.netmgmt:~# echo ---- Slony-I Replica ----

---- Slony-I Replica ----

root@xxxxxxxxxxxxx.netmgmt:~# psql -U akamai -d ams -c "SELECT datname, pg_encoding_to_char(encoding) FROM pg_database WHERE datname = 'ams'"

 datname | pg_encoding_to_char 

---------+---------------------

 ams     | UTF8

(1 row)


root@xxxxxxxxxxxxx.netmgmt:~# 



Tom    :-)


On Tuesday, March 26, 2019, 10:14:59 AM EDT, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:


"Brad Nicholson" <bradn@xxxxxxxxxx> writes:
> Thomas Tignor <tptignor@xxxxxxxxx> wrote on 03/25/2019 08:25:49 PM:
>> Thanks for writing. As I mentioned to Vijay, the "source" is a JVM
>> using the postgres v42.0.0 JDBC driver. I do not believe we have any
>> explicit encoding set, and so I expect the client encoding is
>> SQL_ASCII. The DB is most definitely UTF8.

> These statements are contradictory.

> The value of client_encoding from your select on pg_settings is SQL_ASCII.

> The docs clearly state:
> https://www.postgresql.org/docs/9.5/runtime-config-client.html
> "Sets the client-side encoding (character set). The default is to use the
> database encoding. "

> If you don't have client_encoding explicitly, then it is using the database
> encoding.

Umm ... not necessarily.  That bit in runtime-config-client.html
correctly states what the *server's* default for client_encoding is,
but an awful lot of client-side code will immediately override that.
psql will try to set it based on its LANG/LC_CTYPE environment,
for example.  I don't know what JDBC does; it might be different.

But in any case, yes, it'd be more reliable to check pg_database.encoding

explicitly.


            regards, tom lane



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux