
Re: How to switch off Snowball stemmer for tsearch2?

> Now
>
> select lexize('ru_ispell_cp1251', 'Дмитриев') -> "Дмитрий"
> select lexize('ru_ispell_cp1251', 'Иванов') -> "Иван"
> - it is completely wrong!
>
> I have a database with all Russian names; is it possible to use it (how?) to

If you have such a database, why not just write a special dictionary and
put it in front?

Because this is a database of Russian NAMES, NOT a database of surnames.
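
For readers unfamiliar with the "put it in front" idea: tsearch2 consults the dictionaries configured for a token type in order, and the first one that recognizes the word decides the result. The sketch below only imitates that rule in Python. The lookup tables, the transliterated words, and the function names are hypothetical illustrations, not tsearch2's API, and a surname dictionary is assumed to exist purely for the sake of the example.

```python
from typing import Callable, Dict, List, Optional

# A "dictionary" here maps a token to its lexemes, or returns None
# when it does not recognize the token (illustrative model only).
Dictionary = Callable[[str], Optional[List[str]]]


def make_lookup_dictionary(table: Dict[str, List[str]]) -> Dictionary:
    """Build a toy dictionary backed by a plain lookup table."""
    def lookup(token: str) -> Optional[List[str]]:
        return table.get(token.lower())
    return lookup


def lexize_chain(dictionaries: List[Dictionary], token: str) -> Optional[List[str]]:
    """Ask each dictionary in order; the first one that knows the token wins."""
    for dictionary in dictionaries:
        lexemes = dictionary(token)
        if lexemes is not None:
            return lexemes
    return None  # no dictionary recognized the token


# Hypothetical toy data, transliterated for readability.
surnames = make_lookup_dictionary({
    "ivanov": ["ivanov"],
    "dmitriev": ["dmitriev"],
})
ispell_like = make_lookup_dictionary({
    "ivanov": ["ivan"],
    "ivanami": ["ivan"],
    "dmitriev": ["dmitriy"],
})

# The special dictionary is placed in front of the ispell-like one.
chain = [surnames, ispell_like]
```

With this ordering, `lexize_chain(chain, "Ivanov")` is answered by the surname table before the ispell-like table is ever asked, while a word such as "ivanami" still falls through to it.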


> make lexize() not convert "Ivanov" to "Ivan" even if the ispell
> dictionary contains an element for "Ivan"? So, this pseudo-code logic is
> needed:
>
> function new_lexize($string) {
>  $stem = lexize('ru_ispell_cp1251', $string);
>  if ($stem in names_database) return $string; else return $stem;
> }
>
> Maybe tsearch2 implements this logic already?

Sure, that's how text search mapping works.

Could you please elaborate?

Of course, I can generate all word forms of all Russian names using ispell and then subtract this full list from the Ispell dictionary (so I will remove "Ivan", "Ivanami", etc. from it). But possibly tsearch2 already has this subtraction algorithm.
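
The subtraction described above could look roughly like the following. This is only a sketch under stated assumptions: the Ispell data is modeled as a hypothetical lemma-to-word-forms map, the names database as a plain set, and the words are transliterated toy data. It does not read real Ispell `.dict`/`.aff` files.

```python
from typing import Dict, Set


def subtract_names(ispell_forms: Dict[str, Set[str]], names: Set[str]) -> Dict[str, str]:
    """Return a form -> lemma map with every word form of a listed name removed."""
    form_to_lemma: Dict[str, str] = {}
    for lemma, forms in ispell_forms.items():
        if lemma in names:
            continue  # drop the name together with all of its generated forms
        for form in forms:
            form_to_lemma[form] = lemma
    return form_to_lemma


# Hypothetical toy data: each lemma with the word forms ispell would generate.
ispell_forms = {
    "ivan": {"ivan", "ivanami", "ivanov"},
    "stol": {"stol", "stola", "stolami"},
}
names = {"ivan"}

reduced = subtract_names(ispell_forms, names)
```

After the subtraction, "ivanov" is no longer known to the reduced dictionary at all, so what happens to it depends on whichever dictionary comes next in the mapping.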
 
Dmitry, it seems your company could be my client :)

Not now, thank you. Maybe later.


