raf <raf@xxxxxxx> writes:
> On Mon, Sep 14, 2020 at 05:39:57PM -0400, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
>> On the other hand, the very same thing could be said of database names
>> and role names, yet we have never worried much about whether those were
>> encoding-safe when viewed from databases with different encodings, nor
>> have there been many complaints about the theoretical unsafety.  So maybe
>> this is just overly anal-retentive and we should drop the restriction,
>> or at least pass through data that doesn't appear to be invalidly
>> encoded.

> Perhaps recode database/role names from the source
> database's encoding into utf8, and then recode from utf8
> to the destination database's encoding?

A lot of people seem to believe that transcoding through utf8 is
100% safe.  They're wrong :-( --- the Japanese, at least, have
reason not to trust it, because of the existence of multiple
incompatible conversion standards.  And you're still left with the
question of what to do when the destination encoding hasn't got the
character.  Moreover, this is all moderately expensive unless
the encodings in question are already utf8 or latin1.

So if we go this way I'd prefer to do it as I said above -- just drop or
question-mark-ize any characters that don't pass validation in the
recipient DB.  That's fairly cheap and it will work perfectly in the
typical case where the whole cluster is on one encoding anyway.

			regards, tom lane
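
For illustration only, the "question-mark-ize anything that fails validation
in the recipient DB" idea above could be sketched roughly as follows. This is
Python, not PostgreSQL source; the names sanitize_name, _question_mark and the
"qmark" error handler are hypothetical, and the sketch assumes the name arrives
as raw bytes that are simply checked against the recipient database's encoding,
with no transcoding step.

```python
import codecs


def _question_mark(exc):
    # Hypothetical handler: substitute '?' for each byte run that fails
    # validation, then resume decoding right after the offending bytes.
    if isinstance(exc, UnicodeDecodeError):
        return ("?", exc.end)
    raise exc


# Register the handler under an illustrative name so decode() can use it.
codecs.register_error("qmark", _question_mark)


def sanitize_name(raw: bytes, recipient_encoding: str) -> str:
    """Validate raw name bytes against the recipient encoding.

    Bytes that are valid pass through unchanged; anything invalid
    becomes '?'. Nothing is recoded from the source encoding.
    """
    return raw.decode(recipient_encoding, errors="qmark")


if __name__ == "__main__":
    latin1_name = "café".encode("latin-1")       # b'caf\xe9'
    print(sanitize_name(latin1_name, "utf-8"))   # invalid byte replaced
    print(sanitize_name(b"postgres", "utf-8"))   # ASCII passes through
```

When every database in the cluster shares one encoding, the handler never
fires and names come through untouched, which is the cheap common case the
message describes. Dropping the characters instead would only require the
handler to return an empty string in place of "?".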