Re: Best practices for moving UTF8 databases


Phoenix Kiula wrote:
I tried this and got an error.


mypg=# select * from interesting WHERE NOT description ~ ( '^('||
mypg(#    $$[\09\0A\0D\x20-\x7E]|$$||               -- ASCII
mypg(#    $$[\xC2-\xDF][\x80-\xBF]|$$||             -- non-overlong 2-byte
mypg(#     $$\xE0[\xA0-\xBF][\x80-\xBF]|$$||        -- excluding overlongs
mypg(#    $$[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}|$$||  -- straight 3-byte
mypg(#     $$\xED[\x80-\x9F][\x80-\xBF]|$$||        -- excluding surrogates
mypg(#     $$\xF0[\x90-\xBF][\x80-\xBF]{2}|$$||     -- planes 1-3
mypg(#    $$[\xF1-\xF3][\x80-\xBF]{3}|$$||          -- planes 4-15
mypg(#     $$\xF4[\x80-\x8F][\x80-\xBF]{2}$$||      -- plane 16
mypg(#   '*)$' )
mypg-#
mypg-#   ;
ERROR:  invalid regular expression: quantifier operand invalid

If you really don't want to go the "pg_dump -> iconv (remove invalid characters) -> diff the dump files" route, a stored procedure that scans for invalid UTF-8 characters was posted to pgsql-hackers a few years back:

http://archives.postgresql.org/pgsql-hackers/2005-12/msg00511.php

http://svana.org/kleptog/pgsql/utf8_verify.sql
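
For the dump-file route, here is a rough sketch of how the offending lines could be located without iconv and diff. It is not the stored procedure linked above, just an illustration in Python; the function name find_invalid_utf8_lines and the idea of passing the dump path as the first command-line argument are my own assumptions. It relies on Python's strict UTF-8 decoder, which rejects overlong encodings and surrogates, so it covers roughly the same cases the regular expression was trying to catch.

```python
from typing import Iterable, Iterator, Tuple


def find_invalid_utf8_lines(
    lines: Iterable[bytes],
) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (line_number, byte_offset, line) for each line that is not valid UTF-8.

    Hypothetical helper for illustration. Splitting on newlines is safe
    because the byte 0x0A never occurs inside a multi-byte UTF-8 sequence.
    """
    for line_number, line in enumerate(lines, start=1):
        try:
            # Strict decoding raises on any malformed sequence.
            line.decode("utf-8")
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first bad byte within the line.
            yield line_number, exc.start, line.rstrip(b"\n")


if __name__ == "__main__":
    import sys

    # Assumed usage: python find_bad_utf8.py plain_text_dump.sql
    with open(sys.argv[1], "rb") as dump:
        for line_number, offset, line in find_invalid_utf8_lines(dump):
            print(f"line {line_number}, byte {offset}: {line!r}")
```

Run it against a plain-text pg_dump file; each reported line should correspond to a row (or COPY data line) that needs fixing before the data is loaded into the new database.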

--
Justin Pasher

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general