On Fri, Dec 2, 2011 at 8:16 AM, Torsten Zuehlsdorff <foo@xxxxxxxxxxxxxxxxxxx> wrote:
> Damien Churchill wrote:
>
>>> After several attempts I have finally succeeded in developing a
>>> urlencode() function to encode text correctly as defined in RFC 1738.
>>>
>>> Now I have a big problem: how to decode the text?
>>>
>>> Example:
>>> # SELECT urlencode('Hellö World!');
>>>        urlencode
>>> -----------------------
>>>  Hell%C3%B6%20World%21
>>>
>>> Does anybody know a way to convert '%21' back to '!' and '%C3%B6' to 'ö'?
>>
>> I've extracted the unquote method [0] from urllib in the Python stdlib
>> that decodes urlencoded strings. Hopefully it will be of some use!
>
> Not directly, but it gives me some helpful hints. For example, I'm now able
> to decode some basic characters:
>
> # SELECT chr(x'21'::int);
>  chr
> -----
>  !
> (1 row)
>
> But I clearly have a misunderstanding of other chars, like umlauts or UTF-8
> chars. This, for example, should return an 'ö':
>
> # SELECT chr(x'C3B6'::int);
>  chr
> -----
>  쎶
> (1 row)
>
> Also, I'm not sure how to figure out when to decode '%C3' and when to decode
> '%C3%B6'.
>
> Thanks for your help,

You're welcome. Get ready for some seriously abusive SQL:

create or replace function unencode(text) returns text as
$$
  -- split the input into '%XX' escapes and single characters
  with q as
  (
    select (regexp_matches($1, '(%..|.)', 'g'))[1] as v
  )
  -- turn each escape into chr(byte value); pass everything else through
  select string_agg(case when length(v) = 3 then
    chr(replace(v, '%', 'x')::bit(8)::int) else v end, '')
  from q;
$$ language sql immutable;

-- the function emits one character per byte, so the multi-byte 'ö' only
-- comes out right with a latin1 client encoding on a UTF-8 terminal
set client_encoding to latin1;
SET

postgres=# select unencode('Hell%C3%B6%20World%21');
   unencode
---------------
 Hellö World!
(1 row)

Time: 1.908 ms

(Maybe this isn't really an immutable function, but oh well.)

merlin
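
On the question of when to decode '%C3' alone and when to decode '%C3%B6' together: one way around it is to never decide per escape. Turn every '%XX' into a single byte, concatenate all the bytes, and let the server interpret the whole sequence as UTF-8 at the end. Below is a minimal sketch of that idea, not the function posted above. It assumes a UTF8 database and PostgreSQL 9.4 or later (for WITH ORDINALITY); the name urldecode is hypothetical, and it is declared stable rather than immutable because convert_from and convert_to depend on the database encoding.

```sql
-- Hypothetical helper: percent-decoding via bytea, assuming a UTF8 database.
create or replace function urldecode(text) returns text as
$$
  select convert_from(
           string_agg(
             case when length(t.m[1]) = 3
                  -- '%C3' -> the single byte 0xC3
                  then decode(substring(t.m[1] from 2), 'hex')
                  -- a literal character -> its UTF-8 bytes
                  else convert_to(t.m[1], 'UTF8')
             end,
             ''::bytea
             order by t.ord),          -- keep the pieces in input order
           'UTF8')                     -- interpret the joined bytes as UTF-8
  from regexp_matches($1, '(%[0-9A-Fa-f]{2}|.)', 'g')
       with ordinality as t(m, ord);
$$ language sql stable strict;

select urldecode('Hell%C3%B6%20World%21');
```

With this variant no change to client_encoding should be needed, since the bytes 0xC3 0xB6 are combined into one 'ö' on the server. Input whose decoded bytes are not valid UTF-8 raises an error instead of returning garbage, and an empty string yields NULL because no rows are aggregated.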