Re: text and bytea

"hernan gonzalez" <hgonzalez@xxxxxxxxx> · Mon, 25 Feb 2008 14:52:17 -0200

Another example  (Psotgresql 8.3.0, UTF-8  server/client encoding)

test=# create table chartest ( c text);
test=# insert into chartest (c) values ('¡Hasta mañana!');
test=# create view vchartest as
 select encode(convert_to(c,'LATIN9'),'escape') as c1 from chartest;

test=# select c,octet_length(c) from chartest ;
       c        | octet_length
----------------+--------------
 ¡Hasta mañana! |           16

test=# select c1,octet_length(c1) from vchartest ;
      c1      | octet_length
--------------+--------------
 Hasta maana! |           14

(the field is seen as as text by postgresql, with the default
encoding.. UTF8; it is actually not)

test=# select * from vchartest where c1 like '%a%';
      c1
--------------
 Hasta maana!
(1 row)

test=# select * from vchartest where c1 ilike '%a%';
ERROR:  invalid byte sequence for encoding "UTF8": 0xa1
HINT:  This error can also happen if the byte sequence does not match
the encoding expected by the server, which is controlled by
"client_encoding".

That "ilike" breaks and "like" is rather random, it seems that the
later has some optimization does not check the validty of the utf8
stream. But thats not the point. The point is that IMO postgresql
should always handle text in the backend encoding, there should no
exists funcions that are designed to produce/consume texts in other
encodings. Perhaps the "encode" function is ill defined, and should be
rethinked. Two alternatives:
1. For special binary-to-ascii encodings (base64,hex). Keep its
present signature but remove/deprecate the "escape" type. It returns a
text in the backend encoding.
2  For arbitrary binary encodings. Change its signature so that it
returns bytea.
Of course, all this applies symmetrically to decode().

Appart, and in accordance with this, I think to_ascii() should accept
only one text argument.

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your
       message can get through to the mailing list cleanly