Re: verifying unicode locale support

On Tue, Apr 13, 2004 at 12:32:17PM -0400, Tom Lane wrote:
> Holger Klawitter <lists@klawitter.de> writes:
> > In order to avoid interaction with gcc, cat and others, I've written a
> > new program that reads from a file.
> 
> After setting up the test case and duplicating your problem, I realized
> I was being dense :-( ... this is a well-known issue.  Need more
> caffeine before answering bug reports obviously ...
> 
> The problem is that PG's upper() and lower() functions are based on
> the C library's <ctype.h> functions (toupper() and tolower()), which of
> course only work for single-byte character sets.  So they cannot work on
> UTF8 data.
> 
> There has been some talk of rewriting these functions to use the
> <wctype.h> API where available, but no one's actually stepped up to the
> plate and done it.  IIRC the main sticking point was figuring out how to
> get from whatever character encoding the database is using into the wide
> character set representation the C library wants.  There doesn't seem to
> be a portable way of discovering exactly what the wchar encoding is
> supposed to be for the current locale setting.
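To make the two approaches in the quoted text concrete, here is a minimal sketch in plain C. It is not PostgreSQL code; the names upper_bytes and upper_wide are made up for illustration. The first function works byte by byte with <ctype.h>, so the bytes of a multibyte UTF-8 character are never treated as one character. The second goes through wchar_t and <wctype.h>, and it assumes that the input bytes are encoded in whatever the current LC_CTYPE locale expects, which is exactly the assumption that is hard to verify portably.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Byte-wise upper-casing via <ctype.h>: every byte is treated as one character. */
static void upper_bytes(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++)
        *s = (char) toupper((unsigned char) *s);
}

/*
 * Upper-casing via <wctype.h> (illustrative sketch).
 * ASSUMPTION: "in" is encoded in the multibyte encoding of the current
 * LC_CTYPE locale. Nothing here can check that assumption.
 * Returns 0 on success, -1 if the input is not valid in the locale's
 * encoding, memory runs out, or "out" is too small.
 */
static int upper_wide(const char *in, char *out, size_t outsz)
{
    /* A NULL destination asks only for the length (POSIX behaviour). */
    size_t n = mbstowcs(NULL, in, 0);
    if (n == (size_t) -1)
        return -1;

    wchar_t *w = malloc((n + 1) * sizeof *w);
    if (w == NULL)
        return -1;
    mbstowcs(w, in, n + 1);

    for (size_t i = 0; i < n; i++)
        w[i] = (wchar_t) towupper((wint_t) w[i]);

    /* Convert back to multibyte form, checking the size first. */
    int rc = -1;
    size_t m = wcstombs(NULL, w, 0);
    if (m != (size_t) -1 && m < outsz) {
        wcstombs(out, w, outsz);
        rc = 0;
    }
    free(w);
    return rc;
}
```

A caller would normally run setlocale(LC_CTYPE, "") first so that upper_wide sees the environment's encoding. If the string's real encoding differs from the locale's, the conversion either fails or silently produces wrong characters.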

 There is "libcharset", a portable character set determination library.
 But maintaining such a library, with a lot of OS-dependent code, is
 probably not simple. It's used in the standard iconv.

 http://www.haible.de/bruno/packages-libcharset.html

 But I'm not sure it resolves anything, because there is no guarantee
 of any connection between the current locale setting and the encoding
 of a given string, for example:
 
     SELECT upper( convert('foo', 'X', 'Y') );

 IMHO the solution is to add to "struct varlena" a pointer to a
 pg_encname that handles the PostgreSQL encoding information, making
 each PostgreSQL string independent and self-describing. Or is there
 some reason why this would be useless?
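
 A rough sketch of the self-describing string idea, in standalone C. This is not the real "struct varlena" or pg_encname; the type tagged_text, the enum enc_tag and both functions are hypothetical. The case mapping is a toy: it covers only ASCII and the Latin-1 letters U+00E0 to U+00FE, just enough to show that the same bytes need different handling depending on the tag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical encoding tags; not PostgreSQL's real encoding identifiers. */
typedef enum { ENC_ASCII, ENC_LATIN1, ENC_UTF8 } enc_tag;

/* Hypothetical self-describing string: the bytes plus their encoding. */
typedef struct {
    enc_tag       enc;
    size_t        len;
    unsigned char data[64];
} tagged_text;

static tagged_text make_text(enc_tag enc, const char *bytes)
{
    tagged_text t;
    size_t n = strlen(bytes);
    assert(n < sizeof t.data);
    t.enc = enc;
    t.len = n;
    memcpy(t.data, bytes, n + 1);
    return t;
}

/* Upper-case in place, choosing the rule from the string's own tag. */
static void tagged_upper(tagged_text *t)
{
    for (size_t i = 0; i < t->len; i++) {
        unsigned char c = t->data[i];
        if (c >= 'a' && c <= 'z') {
            t->data[i] = (unsigned char) (c - 0x20);
        } else if (t->enc == ENC_LATIN1 &&
                   c >= 0xE0 && c <= 0xFE && c != 0xF7) {
            /* Latin-1: one byte per letter, 0xF7 is the division sign. */
            t->data[i] = (unsigned char) (c - 0x20);
        } else if (t->enc == ENC_UTF8 && c == 0xC3 && i + 1 < t->len) {
            /* UTF-8: U+00E0..U+00FE are the two bytes C3 A0..C3 BE. */
            unsigned char d = t->data[i + 1];
            if (d >= 0xA0 && d <= 0xBE && d != 0xB7)
                t->data[i + 1] = (unsigned char) (d - 0x20);
            i++; /* skip the continuation byte */
        }
    }
}
```

 With this, the bytes C3 A9 become C3 89 when tagged ENC_UTF8 (e-acute to E-acute), but stay unchanged when tagged ENC_LATIN1, where they are two separate characters that have no upper-case form.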

    Karel

-- 
 Karel Zak  <zakkr@zf.jcu.cz>
 http://home.zf.jcu.cz/~zakkr/
