Search Postgresql Archives

Re: Changing collate & ctype for an existing database

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 07/12/2017 01:54 PM, Albe Laurenz wrote:
rihad wrote:
Hi there. We have a working database that was unfortunately created by
initdb with default ("C") collation & ctype. All other locale specific
settings have the value en_US.UTF-8 in postgresql.conf. The database
itself is multilingual and all its data is stored in UTF-8. Sorting
doesn't work correctly, though. To fix that, can I just do this:


update pg_database set datcollate='en_US.UTF-8', datctype='en_US.UTF-8'
where datname='mydb';


This does seem to work on a testing copy of the database, i.e. select
lower('БлаБлаБла') now works correctly when connected to that database.


Is there still any chance for corrupting data by doing this, or indexes
stopping working etc?

p.s.: postgres 9.6.3
As explained, yes.  Indexes on string columns will be corrupted.

See this example:

test=# CREATE DATABASE breakme LC_COLLATE "C" LC_CTYPE "C" TEMPLATE template0;
test=# \c breakme
breakme=# CREATE TABLE sort(id integer PRIMARY KEY, val text NOT NULL);
breakme=# INSERT INTO sort VALUES (1, 'LITTLE'), (2, 'big'), (3, 'b-less');
breakme=# CREATE INDEX ON sort(val);
breakme=# SET enable_seqscan=off;  -- force index use
breakme=# SELECT * FROM sort ORDER BY val;
┌────┬────────┐
│ id │  val   │
├────┼────────┤
│  1 │ LITTLE │
│  3 │ b-less │
│  2 │ big    │
└────┴────────┘
(3 rows)

breakme=# UPDATE pg_database SET datcollate='en_US.UTF-8', datctype='en_US.UTF-8' WHERE datname='breakme';
breakme=# \c breakme
breakme=# SET enable_seqscan=off;  -- force index use
breakme=# SELECT * FROM sort ORDER BY val;
┌────┬────────┐
│ id │  val   │
├────┼────────┤
│  1 │ LITTLE │
│  3 │ b-less │
│  2 │ big    │
└────┴────────┘
(3 rows)

breakme=# SET enable_seqscan=on;  -- this and the following force sequential scan
breakme=# SET enable_bitmapscan=off;
breakme=# SET enable_indexscan=off;
breakme=# SET enable_indexonlyscan=off;
breakme=# SELECT * FROM sort ORDER BY val;  -- this returns the correct order
┌────┬────────┐
│ id │  val   │
├────┼────────┤
│  2 │ big    │
│  3 │ b-less │
│  1 │ LITTLE │
└────┴────────┘
(3 rows)

As you see, your index is still sorted according to the C collation
and scanning it returns wrong results.

Yours,
Laurenz Albe

This ordering issue can certainly be classified as an inconsistency, but nothing to lose sleep over. Is this all that is normally meant when saying "index corruption"? What about updating or deleting the wrong row addressed by the textual index that hasn't been rebuilt after datcollate/datctype change, complete table/database corruption, or other scary night-time stories of this kind? Possible?



--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux