Phoenix Kiula wrote:
> Really, PG absolutely needs a way to upgrade the database without so
> much data related downtime and all these silly woes. Several competing
> database systems are a cinch to upgrade.

I'd call it data corruption, not a silly woe.

I know that Oracle, for example, would not make that much fuss about
your data: they would be imported without even a warning, and depending
on your encoding settings the bad bytes would either be imported as-is
or tacitly changed to inverted (or normal) question marks.

It's basically a design choice that PostgreSQL made: we think that an
error is preferable to clandestinely modifying the user's data or
accepting input that cannot possibly make any sense when it is
retrieved at a future time.

> Anyway this is the annoying error I see as always:
>
> ERROR: invalid byte sequence for encoding "UTF8": 0x80
>
> I think my old DB is all utf8. If there are a few characters that are
> not, how can I work with this? I've done everything I can to take care
> of the encoding and such. This code was used to initdb:
>
> initdb --locale=en_US.UTF-8 --encoding=UTF8
>
> Locale environment variables are all "en_US.UTF-8" too.

"0x80" makes me think of the following:

The data originate from a Windows system, where 0x80 is a Euro sign.
Somehow these were imported into PostgreSQL without the appropriate
translation into UTF-8 (how, I do not know).

I wonder: why do you spend so much time complaining instead of simply
locating the buggy data and fixing them?

This does not incur any downtime (you can fix the data in the old
database before migrating), and it will definitely enhance the fun
your users have with your database (if they actually see Euros where
they should be).

Yours,
Laurenz Albe
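
As a rough illustration of "locating the buggy data", here is a minimal Python sketch that scans the raw bytes of a text dump for lines that are not valid UTF-8 and shows how each would read if the stray bytes were Windows-1252. The function name find_invalid_utf8_lines and the sample bytes are made up for this example, and the Windows-1252 origin is only my guess from the 0x80 in the error message, so treat the decoded text as a hint rather than a diagnosis.

```python
from typing import Iterator, Tuple


def find_invalid_utf8_lines(data: bytes) -> Iterator[Tuple[int, bytes, str]]:
    """Yield (line_number, raw_line, cp1252_guess) for each line that is not valid UTF-8.

    Hypothetical helper for illustration; it is not part of PostgreSQL or any library.
    """
    for lineno, raw in enumerate(data.split(b"\n"), start=1):
        try:
            raw.decode("utf-8")  # strict decoding: raises on a lone 0x80
        except UnicodeDecodeError:
            # Assumption: the bad bytes came from a Windows-1252 client.
            # Bytes that cp1252 leaves undefined are shown as U+FFFD.
            guess = raw.decode("cp1252", errors="replace")
            yield lineno, raw, guess


# Made-up sample: line 1 holds a bare 0x80, line 2 is proper UTF-8.
sample = b"price: 10 \x80\nname: caf\xc3\xa9\n"
hits = list(find_invalid_utf8_lines(sample))
# hits holds one entry, for line 1, whose guess ends in a Euro sign.
```

For a real dump you would read the file in binary mode and pass its contents to the function. The reported lines tell you which rows to correct in the old database before migrating.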