Search Postgresql Archives

[tsvector] to_tsvector called multiple times

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi everybody,

the following stemming results made me curious:

select to_tsvector('german', 'systeme'); > 'system':1
select to_tsvector('german', 'systemes'); > 'system':1
select to_tsvector('german', 'systems'); > 'system':1
select to_tsvector('german', 'systemen'); > 'system':1
select to_tsvector('german', 'system'); >  'syst':1


First of all, this seems to be a bug in the German stemmer. Where can I fix it?

Second, and more importantly, as I understand it, the stemmed version of a word should be considered normalized. That is, all other versions of that stem should be mapped to it as well. The interesting problem here is that PostgreSQL maps the stem itself ('system') to a completely different stem ('syst').

Should a stem not remain stable even when to_tsvector is called on it multiple times?

--
Sven R. Kunze
TBZ-PARIV GmbH, Bernsdorfer Str. 210-212, 09126 Chemnitz
Tel: +49 (0)371 33714721, Fax: +49 (0)371 5347920
e-mail: srkunze@xxxxxxxxxxxx
web: www.tbz-pariv.de

Geschäftsführer: Dr. Reiner Wohlgemuth
Sitz der Gesellschaft: Chemnitz
Registergericht: Chemnitz HRB 8543



--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general





[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux