but is 'dursten' some form of 'durst'?
Yes it is.
Hm, even when I remove `dursten' and `durst' altogether from the dict,
I still get `sen'.
How can I update a tsvector column to strip the `sen' lexeme?
Thanks!
On 03.08.2006 12:54, Oleg Bartunov wrote:
Hannes,
I don't know German, sorry, but is 'dursten' some form of 'durst'?
This is probably a false hit from the compound word support. I'd suggest
putting an exclusion dictionary (based on the synonym dictionary) before
ispell. It could be very simple:
durst : durst
Oleg
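
Oleg's suggestion can be sketched as a toy model in Python. This is not tsearch2 code: it assumes only that dictionaries are consulted in order and that the first one to recognize a token supplies its lexemes. The names `fake_ispell`, `make_exclusion_dict`, and `lexize_with_chain` are hypothetical, and the ispell outputs are hard-coded from the ts_debug results quoted in this thread.

```python
# Toy model of a dictionary chain. Each dictionary returns a list of lexemes
# for a token it recognizes, or None to pass the token to the next dictionary.

def make_exclusion_dict(entries):
    """Synonym-style dictionary: maps listed words to fixed lexemes."""
    def lexize(token):
        return entries.get(token)
    return lexize

def fake_ispell(token):
    """Stand-in for de_ispell, hard-coded to the outputs quoted in the thread."""
    table = {
        "durst": ["dur", "sen"],
        "höchsten": ["sen", "höch", "höchst", "höchsten"],
    }
    return table.get(token)

def lexize_with_chain(token, chain):
    """Return the lexemes from the first dictionary that recognizes the token."""
    for dictionary in chain:
        lexemes = dictionary(token)
        if lexemes is not None:
            return lexemes
    return [token]  # unrecognized tokens pass through unchanged in this model

def shares_lexeme(a, b, chain):
    """True if the two words have at least one lexeme in common."""
    return bool(set(lexize_with_chain(a, chain)) & set(lexize_with_chain(b, chain)))

# The exclusion entry mirrors the "durst : durst" line above.
exclusion = make_exclusion_dict({"durst": ["durst"]})
plain_chain = [fake_ispell]
guarded_chain = [exclusion, fake_ispell]

print(shares_lexeme("durst", "höchsten", plain_chain))
print(shares_lexeme("durst", "höchsten", guarded_chain))
```

In this model the plain chain lets `durst' and `höchsten' meet on the shared lexeme `sen', while the guarded chain maps `durst' to itself before ispell ever sees it. Rows indexed before such a change would still carry the old lexemes.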
On Thu, 3 Aug 2006, Hannes Dorbath wrote:
SELECT ts_debug('durst');
(default_german,lword,"Latin word",durst,"{de_ispell,de}","'dur' 'sen'")
SELECT ts_debug('höchsten');
(default_german,word,Word,höchsten,"{de_ispell,de}","'sen' 'höch' 'höchst' 'höchsten'")
For some reason both produce the lexeme `sen'. That leads to strange
results: a search for `durst' will highlight `höchsten' with headline().
Server is PG 8.0.4,
german snowball stemmer,
dictionary used is http://hannes.imos.net/german_iso.med
(From OpenOffice)
What causes some words to result in `sen', though they don't contain
that lexeme?
Thanks!
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@xxxxxxxxxx, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
--
Regards,
Hannes Dorbath