Re: unable to restore 8.2.5

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Sep 19, 2008 at 6:29 PM, Tore Halset <halset@xxxxxxxxxxx> wrote:
> Looks like I have managed to insert an illegal character into the main
> system that does not conform to UTF-8. Anything I can and should do to work
> around this issue?

I have had the same problem previously and after a lot of help from
Tom Lane basically came up to the following...

You need to basically dump your table out (or a subset containing the
row ID and column that would have the bad data) in plain text and then
parse it with a script to detect invalid UTF-8 sequences, then find
what rows the bad data is in and go and fix it.

It is either that or you drop the data inserting some other character.
 But this has obvious drawbacks.

I wrote a short ruby script that goes through a dumped file line by
line and puts each line through Iconv to parse it from UTF-8 to UTF-8,
if it fails it dumps the offending line to a log file.

A ruby script that would just print the offending row would go
something like this:

require 'iconv'
File.read(ARGV[0]).each do |line|
  begin
    Iconv.iconv('UTF-8', 'UTF-8', line)
  rescue
    puts "Failed: #{line}"
  end
end

Save that in a file (find_invalid_utf8.rb) then run it with:

$ ruby find_invalid_utf8.rb my_dumped_table.csv

It's not pretty, and just dumps the raw output to the screen, but it
might do for you.

-- 
http://lindsaar.net/
Rails, RSpec and Life blog....


[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux