
Re: invalid byte sequence for encoding "UTF8"


 



On Wed, Mar 21, 2007 at 09:54:41AM -0700, Alan Hodgson wrote:
> iconv needs to read the whole file into RAM.  What you can do is use the 
> UNIX split utility to split the dump file into smaller segments, use iconv 
> on each segment, and then cat all the converted segments back together into 
> a new dump file.  iconv is I think your best option for converting the dump 
> to a valid encoding.
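
For illustration, the same conversion can also be done in one pass without holding the file in memory and without split. Note that cutting a file at arbitrary byte offsets may slice a multibyte character in half, which an incremental decoder avoids. This is only a sketch: the function name, the chunk size, and the "latin-1" source encoding are assumptions, since the real encoding of the dump is not stated here.

```python
import codecs
import io


def transcode_stream(src, dst, src_encoding="latin-1", chunk_size=1 << 20):
    """Copy binary stream src to dst, converting src_encoding to UTF-8.

    Only chunk_size bytes are held in memory at a time. src_encoding is an
    assumption; set it to whatever encoding the dump was really written in.
    """
    decoder = codecs.getincrementaldecoder(src_encoding)(errors="strict")
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # The decoder buffers a trailing partial character until the
        # next chunk arrives, so chunk boundaries are safe.
        dst.write(decoder.decode(chunk).encode("utf-8"))
    # Flush the decoder; raises if the input ends mid-character.
    dst.write(decoder.decode(b"", final=True).encode("utf-8"))


if __name__ == "__main__":
    source = io.BytesIO(b"caf\xe9\n")  # "café" in Latin-1
    target = io.BytesIO()
    transcode_stream(source, target)
    print(target.getvalue())
```

With real files, open both in binary mode ("rb" and "wb") and pass the file objects in place of the BytesIO buffers.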

The guys at openstreetmap have written a UTF-8 cleaner that doesn't
read the whole file into memory:

http://trac.openstreetmap.org/browser/utils/planet.osm/C

Definitely more convenient for large files.
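
To give an idea of how such a streaming cleaner can work, here is a minimal sketch in Python. It is not the openstreetmap tool, just an illustration of the same idea: read fixed-size chunks, keep valid UTF-8 as-is, and substitute U+FFFD for each invalid byte sequence. The function name and chunk size are made up for the example.

```python
import codecs
import io


def clean_utf8_stream(src, dst, chunk_size=1 << 20):
    """Copy binary stream src to dst, replacing invalid UTF-8 with U+FFFD.

    Valid sequences pass through unchanged. Only chunk_size bytes are
    held in memory at a time.
    """
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # A multibyte character split across two chunks is buffered by
        # the decoder rather than being reported as invalid.
        dst.write(decoder.decode(chunk).encode("utf-8"))
    # Flush; a truncated character at end of input becomes U+FFFD.
    dst.write(decoder.decode(b"", final=True).encode("utf-8"))


if __name__ == "__main__":
    dirty = io.BytesIO(b"ok \xff bad, \xc3\xa9 fine\n")
    clean = io.BytesIO()
    clean_utf8_stream(dirty, clean)
    print(clean.getvalue().decode("utf-8"))
```

Replacing rather than dropping the bad bytes leaves a visible marker in the data, so the affected rows can still be found and fixed after the load.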

Have a nice day,
-- 
Martijn van Oosterhout   <kleptog@xxxxxxxxx>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.


