Full text: Ispell dictionary


Good morning/afternoon all

I am currently writing a few articles about PostgreSQL's full text search capabilities and have a question about the Ispell dictionary that I cannot seem to find an answer to. It is probably a very simple issue, so forgive my ignorance.

In one article I am explaining dictionaries, and I have set up a sample configuration that maps most token categories to a single Ispell dictionary (timusan_ispell), which is defined with default settings:

CREATE TEXT SEARCH DICTIONARY timusan_ispell (
	TEMPLATE = ispell,
	DictFile = en_us,
	AffFile = en_us,
	StopWords = english
);
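
For context, a mapping of this kind could be set up roughly as follows. This is only a sketch: the configuration name timusan_config and the list of token types are assumptions made for illustration, not the exact setup used for the article.

```sql
-- Hypothetical configuration name; start from the built-in English configuration.
CREATE TEXT SEARCH CONFIGURATION timusan_config (COPY = pg_catalog.english);

-- Route the word-like token types to the Ispell dictionary only.
-- The token types listed here are an assumption about what "most token categories" covers.
ALTER TEXT SEARCH CONFIGURATION timusan_config
	ALTER MAPPING FOR asciiword, asciihword, hword_asciipart, word, hword, hword_part
	WITH timusan_ispell;
```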

When I run a simple query like "SELECT to_tsvector('timusan-ispell','smiling')" I get back the following tsvector:

'smile':1 'smiling':1

As you can see, I get two lexemes with the same position.
The question is: why does this happen?

Is it normal behavior for the Ispell dictionary to emit multiple lexemes for a single token? And if so, is this efficient? I mean, why could it not simply store the single lexeme 'smile', which (as with the snowball dictionary) would still match 'smiling' when later compared against the accompanying tsquery?
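
To compare the two dictionaries directly, the token can be passed to each one on its own with ts_lexize, and ts_debug shows how the configuration handles it. This is a sketch that assumes the dictionary and configuration names used above; english_stem is the built-in Snowball dictionary for English.

```sql
-- Lexemes returned by the Ispell dictionary for the single token.
SELECT ts_lexize('timusan_ispell', 'smiling');

-- Lexemes returned by the built-in Snowball stemmer for the same token.
SELECT ts_lexize('english_stem', 'smiling');

-- Per-token view of the configuration: which dictionaries were consulted,
-- which one recognized the token, and which lexemes it produced.
SELECT alias, token, dictionaries, dictionary, lexemes
FROM ts_debug('timusan-ispell', 'smiling');
```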

Thanks!

Cheers,
Tim


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)



