On 12/29/2014 4:36 PM, Mike Cardwell wrote:
I'd like to store hostnames in a postgres database and I want to fully support IDNs (Internationalised Domain Names). I want to be able to recover the original representation of the hostname, so I can't just encode it with punycode and then store the ascii result. For example, these two are the same hostnames thanks to unicode case folding [1]:

  tesst.ëxämplé.com
  teßt.ëxämplé.com

They both encode in punycode to the same thing:

  xn--tesst.xmpl.com-cib7f2a

Don't believe me, then try visiting any domain with two s's in, whilst replacing the s's with ß's. E.g:

  ericßon.com
  nißan.com
  americanexpreß.com

So if I pull out "xn--tesst.xmpl.com-cib7f2a" from the database, I've no idea which of those two hostnames was the original representation.

The trouble is, if I store the unicode representation of a hostname instead, then when I run queries with conditions like:

  WHERE hostname='nißan.com'
_IF_ Postgres had a punycode function, then you could use:

  WHERE punycode(hostname) = punycode('nißan.com')

-Andy

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
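
For illustration only, here is a minimal Python sketch of what a punycode()-style normalising function might compute, so that both spellings compare equal. It is not a Postgres feature. The function name to_ascii_hostname is hypothetical, and the sketch assumes the IDNA 2003 behaviour of Python's built-in "idna" codec, which folds ß to "ss"; other IDNA versions may treat ß differently.

```python
def to_ascii_hostname(hostname: str) -> str:
    """Return the ASCII (IDNA 2003) form of a hostname.

    Hypothetical helper: relies on Python's built-in "idna" codec, which
    applies nameprep and punycode to each dot-separated label in turn.
    """
    ascii_form = hostname.encode("idna").decode("ascii")
    # The codec leaves pure-ASCII labels untouched, so lowercase them here
    # to make the comparison case-insensitive as well.
    return ascii_form.lower()


def same_hostname(a: str, b: str) -> bool:
    """Compare two hostnames by their normalised ASCII forms."""
    return to_ascii_hostname(a) == to_ascii_hostname(b)


if __name__ == "__main__":
    # Both spellings from the original question normalise to the same string.
    print(to_ascii_hostname("teßt.ëxämplé.com"))
    print(to_ascii_hostname("tesst.ëxämplé.com"))
    print(same_hostname("nißan.com", "nissan.com"))
```

One possible design, under the same assumptions, is to keep the original Unicode hostname in one column for display and store or index this normalised form alongside it for comparisons.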