Indexing unknown words with Tsearch2

Hi,

First of all, excuse my poor English :)

I'm working on a full-text database with tsearch2 that contains French historical writings. I'm using the fr_ispell dictionary that can be found here: http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/ (ispell-french.tar.gz <http://www.sai.msu.su/%7Emegera/postgres/gist/tsearch/V2/dicts/ispell/ispell-french.tar.gz>, submitted by Max Jacob).
The database encoding is LATIN1.

The problem is that the writings contain many names of public figures, for example Churchill (the database covers WWII). When I search for these names, nothing is found.

I tried many things, including following this introduction: http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch-V2-intro.html I think the root of the problem is that no lexeme is found (I could even say an empty lexeme is found).

With the default en_stem dictionary, I get this:

SELECT lexize('en_stem', 'churchill');
"{churchil}"

Then I add the French dictionary:

INSERT INTO pg_ts_dict
              (SELECT 'fr_ispell',
                      dict_init,
                      'DictFile="/home/.../french.dict",'
                      'AffFile="/home/.../french.aff",'
                      'StopFile="/home/.../french.stop"',
                      dict_lexize
               FROM pg_ts_dict
               WHERE dict_name = 'ispell_template');

And the result is:

SELECT lexize('fr_ispell', 'churchill');
""
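
To tell apart the two cases that can both display as an empty result, a check like the following may help. It is only a sketch, and it assumes the tsearch2 convention that lexize returns NULL for a word the dictionary does not recognize and an empty array for a stop word; the column aliases are made up for illustration.

```sql
-- Sketch: is the "empty" result a NULL or an empty array?
-- Assumption: NULL means the dictionary does not know the word,
-- while '{}' means the word was treated as a stop word.
SELECT lexize('fr_ispell', 'churchill') IS NULL          AS unrecognized,
       lexize('fr_ispell', 'churchill') = '{}'::text[]   AS stop_word;
```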

My questions are:
- Is it OK to get an empty string as the result for a word that is neither in the dictionary nor in the stop words?
- Is there a way to get the word itself as the result when the word is neither in the dictionary nor in the stop words?
- If yes, how?

I'm also interested in any other information you could give me.
Many thanks!

Greg Maitrallain.

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)