
Re: case insensitive match in unicode


On Mon, Mar 27, 2006 at 12:45:05PM +0200, SunWuKung wrote:
> This sounds like a very interesting concept.
> It wouldn't be 'case insensitive' just insensitive.
> 
> The way I imagine it now is a special case of the ~ function.
> I create matchgroups in a table and check each character if it is in the 
> group. If it is I will replace the character with the group in [éÉE], 
> [oóOÓ??] and do a regexp with that.

No need to reinvent the wheel. ICU provides a range of services to deal
with this. For example, the following ICU transform:

 NFD; [:Nonspacing Mark:] Remove; NFC

removes all accents from characters, and it works for all Unicode
characters. With a bit more thought you can handle case variations
as well.
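
As a rough illustration of what that transform does, here is a minimal
sketch in Python. It is an assumption-laden stand-in, not ICU itself: it
uses the standard library's unicodedata module to mimic the three steps
(decompose, drop nonspacing marks, recompose), and the function names
strip_accents, insensitive_key and insensitive_match are made up for this
example. Case handling is done with str.casefold(), which is Python's
locale-independent case folding rather than an ICU service.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Mimic 'NFD; [:Nonspacing Mark:] Remove; NFC' with the stdlib."""
    # NFD: split each character into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Remove: drop every character in general category Mn (Nonspacing Mark).
    without_marks = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # NFC: recompose whatever is left.
    return unicodedata.normalize("NFC", without_marks)


def insensitive_key(text: str) -> str:
    """Hypothetical helper: a key that ignores both accents and case."""
    return strip_accents(text).casefold()


def insensitive_match(a: str, b: str) -> bool:
    """Compare two strings ignoring accents and case."""
    return insensitive_key(a) == insensitive_key(b)
```

With this, insensitive_match("Élan", "elan") is true, so the per-character
match groups from the quoted proposal are not needed. Characters without a
canonical decomposition (for example "ø") are left unchanged by this
approach.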

There is also a locale-independent case-mapping module, plus various
locale-specific ones.

http://icu.sourceforge.net/userguide/Transform.html
http://icu.sourceforge.net/userguide/caseMappings.html
http://icu.sourceforge.net/userguide/normalization.html

Have a nice day,
-- 
Martijn van Oosterhout   <kleptog@xxxxxxxxx>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


