On Mon, 2006-10-23 at 10:26 +0200, Albe Laurenz wrote:
> Jeff Davis wrote:
> > I have a UTF8 encoded database. I can do
> >
> > => SELECT '\xb9'::text;
> >
> > But that seems to be the only way to get an invalid utf8 byte sequence
> > into a text type.
> [...]
> > So, if I were to sum this up in a single question, why does cstring not
> > accept invalid utf8 sequences? And if it doesn't, why are they allowed
> > in any text type?
>
> I would say that it should be impossible to get invalid UTF-8 bytes
> into a text on an UTF-8 database, and my opinion is that it is a bug or
> oversight if a typecast allows you to do so.

That wouldn't help me, but it seems like more consistent behavior.

> The program you are talking about that needs to be able to store
> arbitrary bytes in a text column should be changed - maybe it is enough
> to change the data type of the database column from 'text' to 'bytea'.
>

The problem is that all the bytes in the quoted string are converted to
a cstring first, which rejects invalid UTF8 sequences. So, even if it's
bytea type, the query itself can't contain the bytes I want to store.

The only way bytea would work is using PQexecParams and setting the type
to bytea and the format to binary. I agree that's the more robust way
for the application to be written, but unfortunately that's not how it
was written.

Regards,
	Jeff Davis
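
To illustrate why a raw 0xb9 byte in the query text gets rejected: on its
own, 0xB9 is a UTF-8 continuation byte with no lead byte in front of it. The
following is only a minimal sketch in Python, not PostgreSQL's actual check;
is_valid_utf8 is a hypothetical helper name, and Python's strict decoder may
differ in detail from what the server does.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes cleanly as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The lone byte from the SELECT '\xb9'::text example.
lone_byte = b"\xb9"

# The same code point (U+00B9) encoded properly as UTF-8 takes two bytes.
encoded = "\u00b9".encode("utf-8")

print(is_valid_utf8(lone_byte))  # False: continuation byte without a lead byte
print(is_valid_utf8(encoded))    # True: b"\xc2\xb9"
```

An application that has to store bytes like lone_byte unchanged cannot put
them into the query string itself, which is why passing them as a
binary-format bytea parameter comes up as the alternative.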