Mohamed,
We are looking into the problem.
Oleg
On Mon, 2 Feb 2009, Mohamed wrote:
No, I don't. But ts_lexize doesn't return anything, so I figured there must
be an error somewhere.
I think we are using the same dictionary, except that I am also using a
stopwords file and a different affix file, because the hunspell (Ayaspell)
.aff gives me this error:
ERROR: wrong affix file format for flag
CONTEXT: line 42 of configuration file "C:/Program Files/PostgreSQL/8.3/share/tsearch_data/hunarabic.affix": "PFX Aa Y 40
/ Moe
On Mon, Feb 2, 2009 at 12:13 PM, Daniel Chiaramello <
daniel.chiaramello@xxxxxxxxx> wrote:
Hi Mohamed.
I don't know where you got the dictionary - I tried the OpenOffice one
myself (the Ayaspell one) without success, and I had no Arabic stopwords
file.
Renaming the files is supposed to be enough (I did it successfully for the
Thai dictionary) - the ".aff" file becomes the ".affix" one.
When I tried to create the dictionary:
CREATE TEXT SEARCH DICTIONARY ar_ispell (
TEMPLATE = ispell,
DictFile = ar_utf8,
AffFile = ar_utf8,
StopWords = english
);
I had an error:
ERREUR: mauvais format de fichier affixe pour le drapeau
CONTEXTE : ligne 42 du fichier de configuration « /usr/share/pgsql/tsearch_data/ar_utf8.affix » : « PFX Aa Y 40
(which means Bad format of Affix file for flag, line 42 of configuration
file)
Do you have an error when creating your dictionary?
Daniel
Mohamed wrote:
I have run into some problems here.
I am trying to implement Arabic fulltext search on three columns.
To create a dictionary I have a hunspell dictionary and an Arabic stopwords
file.
CREATE TEXT SEARCH DICTIONARY hunspell_dic (
TEMPLATE = ispell,
DictFile = hunarabic,
AffFile = hunarabic,
StopWords = arabic
);
1) The problem is that the hunspell dictionary contains a .dic and a .aff
file, but the configuration requires a .dict and a .affix file. I have tried
to change the extensions, but with no success.
2) ts_lexize('hunspell_dic', 'ARABIC WORD') returns nothing
3) How can I convert my .dic and .aff to valid .dict and .affix ?
4) I have read that when using dictionaries, if a word is not recognized by
any dictionary it will not be indexed. I find that troublesome. I would like
everything but the stop words to be indexed. I guess this might be a step
that I am not ready for yet, but just wanted to put it out there.
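On point 4, a minimal sketch of one common arrangement, assuming the hunspell_dic dictionary above can be created successfully: list the built-in "simple" dictionary after it in the configuration's mapping. Dictionaries are consulted in order, and an ispell dictionary does not claim words it does not recognize, so those words fall through to "simple" and are indexed as-is instead of being dropped. The configuration name arabic_cfg and the choice of token types are illustrative assumptions, not something taken from this thread.

```sql
-- Hypothetical configuration name; copying the built-in "simple"
-- configuration gives every token type a mapping to start from.
CREATE TEXT SEARCH CONFIGURATION arabic_cfg (COPY = pg_catalog.simple);

-- Non-ASCII words are reported by the default parser as word, hword
-- or hword_part. Try hunspell_dic first; anything it does not
-- recognize is passed on to "simple", which accepts every word.
ALTER TEXT SEARCH CONFIGURATION arabic_cfg
    ALTER MAPPING FOR word, hword, hword_part
    WITH hunspell_dic, simple;

-- Compare what each dictionary does with the same token.
SELECT ts_lexize('hunspell_dic', 'ARABIC WORD');
SELECT ts_lexize('simple', 'ARABIC WORD');
```

The order matters: a dictionary that accepts everything, such as "simple", has to come last, otherwise the dictionaries after it are never consulted.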
Also, I would like to know what the process of implementing fulltext search
looks like, from configuration to search: create a dictionary, then a text
search configuration, add the dictionary to the configuration, index the
columns with gin or gist ...
What does a search look like? Does it match against the gin/gist index?
Has that index been built using the dictionary/configuration, or is the
dictionary only used on search phrases?
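The flow from configuration to search can be sketched roughly as follows. This is only an illustration: the table ads, its three columns, and the index name are hypothetical, and the sketch assumes a text search configuration named arabic_cfg (also a hypothetical name) already exists. The configuration, and therefore its dictionaries, is applied twice: once when the tsvector stored in the index is built, and once when the search phrase is turned into a tsquery.

```sql
-- Assumes a text search configuration named arabic_cfg already exists
-- (hypothetical name). Hypothetical table with three searchable columns.
CREATE TABLE ads (
    id          serial PRIMARY KEY,
    title       text,
    description text,
    city        text
);

-- Expression index: the configuration is applied here, at index time,
-- so the GIN index stores lexemes produced by the dictionaries.
CREATE INDEX ads_fts_idx ON ads USING gin (
    to_tsvector('arabic_cfg',
        coalesce(title, '') || ' ' ||
        coalesce(description, '') || ' ' ||
        coalesce(city, ''))
);

-- The search phrase goes through the same configuration. Repeating the
-- indexed expression exactly allows the planner to use the GIN index.
SELECT id, title
FROM ads
WHERE to_tsvector('arabic_cfg',
        coalesce(title, '') || ' ' ||
        coalesce(description, '') || ' ' ||
        coalesce(city, ''))
      @@ plainto_tsquery('arabic_cfg', 'ARABIC WORD');
```

Using the same configuration name on both sides of @@ is what keeps indexed lexemes and query lexemes comparable.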
/ Moe
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@xxxxxxxxxx, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)