Re: to_ascii, or some other form of magic transliteration

Hrm, I must be missing something, because I don't see how this will transliterate to ASCII?

On Sep 10, 2005, at 5:30 AM, Mike Rylander wrote:

On 9/9/05, Ben <bench@xxxxxxxxxxxxxxx> wrote:

I'm working on a problem that I imagine others have had, which
basically boils down to having nice Unicode display text that users
are going to want to search against without typing it correctly,
e.g. let a search for "sma" match "små". It seems like the best way
to do this is to find a magic Unicode transliteration mapping
function, and then save the ASCII transliterations for searching
against.



The simplest solution to this that I've found is to maintain a
separate column for an ASCII-ized version of your text.  The conversion
can be done automatically using a trigger, and I have one in PL/PerlU
that I use.  It basically boils down to:

1) transform the Unicode text to Normalization Form D (NFD)
2) strip the combining marks (Unicode category M)

In modern Perls that looks like:

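#--------------
use Unicode::Normalize;

my $txt = NFD(shift());   # 1) decompose into base characters + combining marks
$txt =~ s/\p{M}//g;       # 2) strip the combining marks (Unicode category M)
return $txt;
#--------------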
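
To see what those two steps do outside the database, here is a minimal standalone sketch. The sub name strip_marks and the sample words are only illustrative assumptions and are not part of the trigger described above. It also shows the limit of the approach: characters that have no canonical decomposition (for example "Æ" and "ø") pass through unchanged, so the result is not guaranteed to be pure ASCII.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                           # this source file contains UTF-8 literals
use Unicode::Normalize qw(NFD);

# Hypothetical helper name; it performs the same two steps as the snippet above.
sub strip_marks {
    my ($txt) = @_;
    $txt = NFD($txt);               # 1) decompose: "å" becomes "a" + U+030A
    $txt =~ s/\p{M}//g;             # 2) delete every combining mark
    return $txt;
}

binmode(STDOUT, ':encoding(UTF-8)');
for my $word ("små", "café", "Ærø") {
    my $plain = strip_marks($word);
    # Flag results that still contain non-ASCII characters.
    my $note = $plain =~ /[^\x00-\x7F]/ ? " (still not pure ASCII)" : "";
    print "$word -> $plain$note\n";
}
```

With the sample words this should print "små -> sma" and "café -> cafe", while "Ærø" comes back unchanged and is flagged. Letters like those would need an explicit mapping table in addition to the mark stripping.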

Hope that helps!


