Search Postgresql Archives

Re: FTS with more than one language in body and with unknown query language?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On 14.07.2016 01:16, Stefan Keller wrote:
Hi,

I have a text corpus which contains either German or English docs and
I expect queries where I don't know if it's German or English. So I'd
like e.g. that a query "forest" matches "forest" in body_en but also
"Wald" in body_de.

I created a table with attributes body_en and body_de (type "text"). I
will use ts_vector/ts_query on the fly (don't need yet an index
(attributes)).

* Can FTS handle this multilingual situation?

In my opinion, PostgreSQL cant handle it. It cant translate words from one language to another, it just stems word from original form to basic form. First you need to translate word from English to German, then search word in the body_de attribute.

And the issue is complicated by the fact that one word could have different meaning in the other language.

* How to setup a text search configuration which e.g. stems en and de words?
* Should I create a synonym dictionary which contains word
translations en-de instead of synonyms en-en?

This synonym dictionary will contain a thousands entries. So it will require a great effort to make this dictionary.

* Any hints to related work where FTS has been used in a multilingual context?

:Stefan



--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux