I think the answer lies in a literal reading of the error message: it
objects to the specific byte 0x00, i.e. the null byte. According to
the docs (4.1.2.1. String Constants):
"The character with the code zero cannot be in a string constant."
The reason may be that string constants are handled as C strings under
the hood, so a zero byte would terminate the string and nothing could
follow it.
So the question then becomes how to insert binary data this way. I'm
not sure about that offhand.
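One possible answer, offered as a sketch rather than a tested recipe:
the 8.3 page on binary data types describes an escape format for bytea
in which an octet is written as a backslash followed by three octal
digits, with the backslash doubled inside an E'...' literal, so the
zero octet becomes E'\\000'. The helper below is hypothetical (the
name bytea_escape_literal is made up for illustration) and simply
escapes every octet that way. It assumes the literal is destined for a
bytea column, as in the INSERT quoted below.

```python
def bytea_escape_literal(data: bytes) -> str:
    """Render bytes as an E'...' literal using bytea octal escapes.

    Hypothetical helper for illustration; not part of any PostgreSQL driver.
    """
    # Each octet becomes \\ooo (two backslashes, three octal digits).
    # The string-literal parser reduces \\ to a single backslash, and the
    # bytea input routine then decodes \ooo into the octet itself, so no
    # raw zero byte ever appears inside the string constant.
    body = "".join("\\\\%03o" % octet for octet in data)
    return "E'" + body + "'"


# The five bytes from the INSERT in the quoted message.
sample = bytes([0x15, 0x1C, 0x2F, 0x00, 0x02])
literal = bytea_escape_literal(sample)
print(literal)
```

For those five bytes this prints E'\\025\\034\\057\\000\\002'. From
JDBC, binding the value with PreparedStatement.setBytes instead of
building a literal would likely sidestep the escaping altogether.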
-- Andy
On May 5, 2008, at 11:07 AM, Lee Feigenbaum wrote:
I had thought -- apparently erroneously -- that because this is not
a text-based column, that I could send any string of bytes (octets)
via my INSERT statement to populate values in this column. I'm
using escaped string literals with hexadecimal representation so my
INSERTs look something like:
INSERT INTO myTable VALUES (..., E'\x15\x1C\x2F\x00\x02...', ...) ;
As you might be able to guess, I'm getting the error:
ERROR: invalid byte sequence for encoding "UTF8": 0x00
(I get the error whether I attempt this via JDBC or via the
command-line client with client encoding set to UTF8 or WIN1252.)
Again, I was surprised by this error since I thought from the
documentation at [2] that the server would only expect to be
dealing in a sequence of octets here, without any character-
encoding constraints implied by the DB's encoding.
What is the actual cause of this error, and how do I work around it?
Do I need to pretend that my data is Unicode character data and
specify the UTF8 octets for that character data in my E'...' literal?
thanks in advance for any help!
Lee
PS [3]
[1] Actually, this DDL has been converted from that for a different
DB that uses LONGVARBINARY for this. BYTEA was my best guess for
the PostgreSQL equivalent.
[2] http://www.postgresql.org/docs/8.3/interactive/datatype-binary.html
[3] I also was confused as to why 0x00 would be an invalid UTF8
byte sequence. On its own, as I understand it, 0x00 is a fine UTF8
byte sequence (representing Unicode codepoint 0). And when I (from
the command line) try to insert other invalid UTF8 sequences --
such as INSERT INTO foo VALUES (E'\xC0\x80') I get an error that
mentions the full byte sequence as invalid: "invalid byte sequence
for encoding "UTF8": 0xc080". So this further confuses me. :-)
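The observation in [3] can be checked outside the database. The sketch
below uses Python's strict UTF-8 decoder purely as a stand-in
validator (an assumption for illustration; it says nothing about how
the server itself validates input, and is_valid_utf8 is a made-up
name). It treats a lone 0x00 as well-formed and the overlong pair
0xC0 0x80 as malformed, which suggests the 0x00 rejection comes from a
rule about string constants rather than from UTF-8 itself.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes under Python's strict UTF-8 decoder."""
    try:
        data.decode("utf-8")  # strict error handling is the default
        return True
    except UnicodeDecodeError:
        return False


# A single zero byte encodes code point U+0000 and is well-formed UTF-8.
zero_ok = is_valid_utf8(b"\x00")

# 0xC0 0x80 is an overlong two-byte form of U+0000, which UTF-8 forbids.
overlong_ok = is_valid_utf8(b"\xc0\x80")

print(zero_ok, overlong_ok)
```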