Ok, thank you all for your help; it has been very valuable. I am starting to get the hang of it and have read almost all of chapter 12 plus extras, but I still need a little guidance.
I now have these files:
- An Arabic Hunspell rar file (OpenOffice version), which includes:
    - ar.dic
    - ar.aff
- An Aspell rar file that includes a lot of files
- A Myspell file (it says "simple words list")
- And also Andrew's two files:
    - ar.affix
    - ar.stop
I am thinking that I should go with just one of these, right, and that it should be the Hunspell one? There is an ar.aff file in it, and Andrew's file ends with .affix; are those perhaps similar? Should I skip Andrew's files, or use just the ar.stop file?
I will skip the per-row Arabic / English language search approach and go with the approach suggested by Oleg:
"if arabic and english characters are not overlapped, you can use one index."
Arabic and English letters and words don't overlap, so that should not be an issue? Will I be able to index and search against both languages in the same query?
And also:
- Which language files should I use?
- What should my CREATE TEXT SEARCH DICTIONARY statement for Arabic look like? Perhaps like this:
CREATE TEXT SEARCH DICTIONARY arabic_dic (
    TEMPLATE = ? ,   -- Not sure what this means
    DictFile = ar,   -- referring to ar.dic (Hunspell)
    AffFile = ar,    -- referring to ar.aff (Hunspell)
    StopWords = ar   -- referring to Andrew's stop file (what about Andrew's .affix file?)
    -- Anything more?
);
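To make the question concrete, here is my best guess at the whole setup so far. It is only a sketch under assumptions I have not verified: that the ispell template is the right one for Hunspell files, that ar.dic and ar.aff have to be copied into $SHAREDIR/tsearch_data as ar.dict and ar.affix in UTF-8 (which would clash with Andrew's ar.affix), and that english_stem is the right dictionary for the English side. The names arabic_hunspell and arabic_english are just placeholders I made up.

```sql
-- Assumption: TEMPLATE = ispell reads Hunspell-format files.
-- Assumption: ar.dict, ar.affix and ar.stop sit in $SHAREDIR/tsearch_data.
CREATE TEXT SEARCH DICTIONARY arabic_hunspell (
    TEMPLATE  = ispell,
    DictFile  = ar,   -- ar.dict (renamed from the Hunspell ar.dic)
    AffFile   = ar,   -- ar.affix (renamed from the Hunspell ar.aff)
    StopWords = ar    -- ar.stop (Andrew's stop word list)
);

-- One configuration for both languages, following Oleg's suggestion.
CREATE TEXT SEARCH CONFIGURATION arabic_english (COPY = simple);

-- Non-ASCII words (Arabic) go to the Hunspell dictionary first,
-- falling back to simple for words it does not recognize.
ALTER TEXT SEARCH CONFIGURATION arabic_english
    ALTER MAPPING FOR word, hword, hword_part
    WITH arabic_hunspell, simple;

-- ASCII words (English) go to the built-in English stemmer.
ALTER TEXT SEARCH CONFIGURATION arabic_english
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart
    WITH english_stem;
```

If this is roughly right, I suppose to_tsvector('arabic_english', ...) could feed a single index for both languages, but please correct me if I have misunderstood.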
Thanks again! / Moe