Search Postgresql Archives

Compound words giving undesirable results with tsearch2

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I've setup a database using tsearch2, configured with support for compound
words according to the excellent guide found here:

 http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_compound_words

This works fine. There is however one drawback that I'd like to know
whether can be remedied. Let's say I want to search for records containing
the word 'fritekst', which is a compound Norwegian word meaning
'free text'.

testdb=# select to_tsquery('default_norwegian', 'fritekst');
          to_tsquery
------------------------------
 'fritekst' | 'fri' & 'tekst'
(1 row)

Now, this will indeed match those records, but it will also match any
records containing both of the words 'fri' and 'tekst', without regard
to whether they are next to each other or in completely different parts
of the text being indexed. In many situations, this will lead to a lot
of 'false' matches, seen from a user perspective.

Ideas on how to handle this problem will be much appreciated.

-- 
Lars Haugseth

"If anyone disagrees with anything I say, I am quite prepared not only to
 retract it, but also to deny under oath that I ever said it." -Tom Lehrer


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux