Search Postgresql Archives

Re: finding bogus UTF-8

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Feb 10, 2011 at 1:02 PM, Scott Ribe <scott_ribe@xxxxxxxxxxxxxxxx> wrote:
> I know that I have at least one instance of a varchar that is not valid UTF-8, imported from a source with errors (AMA CPT files, actually) before PG's checking was as stringent as it is today. Can anybody suggest a query to find such values?

If you know which table and column the data is in, you can also do
something like this:

(I typed this up without checking the syntax of it.  The basic idea is
to cast the column as bytea, encode with the 'escape' method, then
grep for back-slashes).

select * from bad_table where regexp_match (encode (bad_column::bytea,
'escape'), '\\\\'));

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux