On Mon, Mar 27, 2006 at 12:45:05PM +0200, SunWuKung wrote:
> This sounds like a very interesting concept.
> It wouldn't be 'case insensitive' just insensitive.
>
> The way I imagine it now is a special case of the ~ function.
> I create matchgroups in a table and check each character if it is in the
> group. If it is I will replace the character with the group in [éÉE],
> [oóOÓ??] and do a regexp with that.

No need to reinvent the wheel. ICU provides a range of services to deal
with this. For example, the following ICU filter removes all accents
from characters:

    NFD; [:Nonspacing Mark:] Remove; NFC

It works for all Unicode characters. With a bit more thought you can
handle case variations as well. ICU also has a locale-independent
case-mapping module, plus various locale-specific ones.

http://icu.sourceforge.net/userguide/Transform.html
http://icu.sourceforge.net/userguide/caseMappings.html
http://icu.sourceforge.net/userguide/normalization.html

Have a nice day,
-- 
Martijn van Oosterhout <kleptog@xxxxxxxxx> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
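
For illustration only, the three steps of the filter above (decompose,
drop nonspacing marks, recompose) can be sketched with Python's standard
unicodedata module. This is a stand-in for the idea, not the ICU API,
and the names strip_accents and fold are hypothetical helpers made up
for this sketch. The str.casefold() step is likewise an assumption about
how the "case variations" part might be handled.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Rough equivalent of 'NFD; [:Nonspacing Mark:] Remove; NFC'."""
    # NFD: split each accented character into base character + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Remove every character whose general category is Mn (Nonspacing Mark).
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # NFC: recompose whatever is left into canonical composed form.
    return unicodedata.normalize("NFC", stripped)


def fold(text: str) -> str:
    """Hypothetical accent- and case-insensitive comparison key."""
    # casefold() is a locale-independent case mapping from the stdlib.
    return strip_accents(text).casefold()
```

With this, two strings can be compared by their folded keys, e.g.
fold(a) == fold(b), rather than by building a character group such as
[éÉE] for every letter by hand.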