Re: finding bogus UTF-8

Vick Khera wrote:
> On Tue, Feb 15, 2011 at 11:09 AM, Geoffrey Myers
> <lists@xxxxxxxxxxxxxxxxxxxxx> wrote:
>> comments would be appreciated.
>
> If all you're doing is filtering stdin to stdout and deleting a range
> of characters, it seems that tr would be a faster tool:
>
> cat foo.txt | tr -d '\000-\008\013-\037\177-\377' > foo-cleaned.txt

I toyed with tr for a bit but could not get it to work, and the command above did not work for me either. I'm not exactly sure what it's doing, but here are a couple of diff lines (original file versus cleaned output):


1619c1619
<     days integer DEFAULT 28,
---
>     days integer DEFAULT 2,


So it appears 'tr' is deleting the '8' character, rather than the octal value for 008.
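
A likely explanation, offered as a sketch rather than a confirmed diagnosis: octal digits only run from 0 to 7, so '\008' is not a single octal escape. tr most likely reads it as '\00' followed by a literal '8', which would account for the diff above. Assuming the intent was to delete bytes 0 through 8 (decimal), the upper bound would be written '\010'. The function name clean_ctrl is made up for illustration, and LC_ALL=C is an added assumption to force byte-wise handling:

```shell
# Hypothetical helper: delete control bytes and high-bit bytes from stdin.
# '\010' is octal for decimal 8 (backspace). Tab ('\011') and newline
# ('\012') are deliberately left out of the deleted ranges.
# LC_ALL=C is an assumption here, so tr treats input as single bytes.
clean_ctrl() {
    LC_ALL=C tr -d '\000-\010\013-\037\177-\377'
}

# The digit '8' should now survive:
printf 'days integer DEFAULT 28,\n' | clean_ctrl
```

It could be used as clean_ctrl < foo.txt > foo-cleaned.txt. Note that the '\177-\377' range removes every byte with the high bit set, so all non-ASCII UTF-8 characters are deleted, valid or not.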


--
Until later, Geoffrey

"I predict future happiness for America if they can prevent
the government from wasting the labors of the people under
the pretense of taking care of them."
- Thomas Jefferson
