Yes, this is normal for an Ispell dictionary. Think of it as a morphological dictionary, which can return more than one normal form for a single word.

On Fri, May 2, 2014 at 11:54 AM, Tim van der Linden <tim@xxxxxxxxx> wrote:
> Good morning/afternoon all
>
> I am currently writing a few articles about PostgreSQL's full text capabilities and have a question about the Ispell dictionary which I cannot seem to find an answer to. It is probably a very simple issue, so forgive my ignorance.
>
> In one article I am explaining dictionaries, and I have set up a sample configuration which maps most token categories to use only an Ispell dictionary (timusan_ispell) which has a default configuration:
>
>     CREATE TEXT SEARCH DICTIONARY timusan_ispell (
>         TEMPLATE = ispell,
>         DictFile = en_us,
>         AffFile = en_us,
>         StopWords = english
>     );
>
> When I run a simple query like "SELECT to_tsvector('timusan-ispell','smiling')" I get back the following tsvector:
>
>     'smile':1 'smiling':1
>
> As you can see I get two lexemes with the same pointer.
> The question here is: why does this happen?
>
> Is it normal behavior for the Ispell dictionary to emit multiple lexemes for a single token? And if so, is this efficient? I mean, why could it not simply save one lexeme 'smile' which (same as the snowball dictionary) would match 'smiling' as well if later matched with the accompanying tsquery?
>
> Thanks!
>
> Cheers,
> Tim
>
> --
> Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
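
For readers who want to see the behavior discussed in this thread for themselves, here is a minimal sketch. It assumes the timusan_ispell dictionary and the 'timusan-ispell' configuration from the question exist in the current database and that the en_us Ispell files are installed; english_stem is the built-in Snowball dictionary, used here only for comparison. No output is shown because it depends on the dictionary files in use.

```sql
-- Ask the dictionary directly which lexemes it produces for one token.
-- timusan_ispell is the dictionary defined in the question (assumed to exist).
SELECT ts_lexize('timusan_ispell', 'smiling');

-- Compare with the built-in Snowball stemmer, which returns a single stem
-- for a token rather than a list of dictionary forms.
SELECT ts_lexize('english_stem', 'smiling');

-- The query text goes through the same configuration as the document text,
-- so the match is decided lexeme against lexeme.
-- 'timusan-ispell' is the configuration name used in the question.
SELECT to_tsvector('timusan-ispell', 'smiling')
       @@ to_tsquery('timusan-ispell', 'smile');
```

ts_lexize returns a text array, so a token with several normal forms should show up as an array with several elements. That array is what ends up as multiple lexemes at the same position in the tsvector.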