I have updated the encode() documentation to not mention "ASCII", and to be more specific about what 'escape' does. Backpatched to 8.2.X. --------------------------------------------------------------------------- Michael Fuhr wrote: > On Thu, Jan 25, 2007 at 02:28:38PM -0500, Michael Artz wrote: > > Perhaps my understanding of the 'encode' function is incorrect, but I > > was under the impression that I could do something like: > > > > SELECT lower(encode(bytes, 'escape')) FROM mytable; > > > > as it sounded like (from the manual) that 'encode' would return valid > > ASCII, with all the non-ascii bytes hex escaped. > > The documentation for encode() does give that impression: "Encode > binary string to ASCII-only representation. Supported types are: > base64, hex, escape." However, the source code for esc_encode() > in src/backend/utils/adt/encode.c says and does otherwise: > > * Only two characters are escaped: > * \0 (null) and \\ (backslash) > > > When I have the byte 0x8a, however, I get the error: > > > > ERROR: invalid byte sequence for encoding "UTF8": 0x8a > > Since encode() returns text and doesn't escape non-ASCII characters, > all of the original binary data will be treated as though it's text > in the database's encoding. If the data contains byte sequences > that aren't valid in that encoding then you get the above error. > > > I have the sneaking suspicion that I am missing something, so please > > correct me if I am wrong. If I am wrong, is there a better way to > > lowercase all the ascii characters in a bytea string? > > What are you trying to do? What is the binary data and why are you > treating it (or part of it) as though it's text? Do you want the > end result to be text with escape sequences or do you want to convert > it back to bytea? > > Something like this might work: > > SELECT lower(textin(byteaout(bytes))) FROM mytable; > > To turn the result back into bytea: > > SELECT decode(lower(textin(byteaout(bytes))), 'escape') FROM mytable; > > -- > Michael Fuhr > > ---------------------------(end of broadcast)--------------------------- > TIP 3: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faq -- Bruce Momjian <bruce@xxxxxxxxxx> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/func.sgml =================================================================== RCS file: /cvsroot/pgsql/doc/src/sgml/func.sgml,v retrieving revision 1.368 diff -c -c -r1.368 func.sgml *** doc/src/sgml/func.sgml 20 Feb 2007 18:18:05 -0000 1.368 --- doc/src/sgml/func.sgml 20 Feb 2007 19:48:37 -0000 *************** *** 1356,1363 **** </entry> <entry><type>text</type></entry> <entry> ! Encode binary data to <acronym>ASCII</acronym>-only representation. Supported types are: <literal>base64</>, <literal>hex</>, <literal>escape</>. </entry> <entry><literal>encode( E'123\\000\\001', 'base64')</literal></entry> <entry><literal>MTIzAAE=</literal></entry> --- 1356,1365 ---- </entry> <entry><type>text</type></entry> <entry> ! Encode binary data to different representation. Supported types are: <literal>base64</>, <literal>hex</>, <literal>escape</>. + <literal>Escape</> merely outputs null bytes as <literal>\000</> and + doubles backslashes. </entry> <entry><literal>encode( E'123\\000\\001', 'base64')</literal></entry> <entry><literal>MTIzAAE=</literal></entry>