
Re: TSearch2 / German compound words / UTF-8


Alexander,

could you try tsearch2 from CVS HEAD?
tsearch2 in 8.1.x doesn't support UTF-8 and works for some people
only by accident :)

	Oleg
On Fri, 27 Jan 2006, Alexander Presber wrote:

Tsearch/ispell is not able to break this word into parts, because of the "s" in "Produktion/s/intervall". Misspelling the word as "Produktionintervall" fixes it:

There should be affixes marked as 'affix in middle of compound word'.
The flag is '~'; for an example, look in the norsk dictionary:

flag ~\\:
   [^S]           >        S              #~ advarsel > advarsels-

BTW, we developed and debugged compound word support on the norsk (Norwegian) dictionary, so look there for examples. We don't know Norwegian ourselves; Norwegians helped us :)
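
To make the rule above concrete: an ispell-style suffix rule such as `[^S] > S` reads "if the word ends in a character other than S, append S". The following is only a minimal Python sketch of that reading, not tsearch2's actual code; the function name `apply_suffix_rule` is hypothetical, and the compound-specific meaning of the `~` flag is not modelled here.

```python
import re

def apply_suffix_rule(word, condition, suffix):
    """Apply one ispell-style suffix rule of the form 'condition > suffix'.

    Returns the affixed word, or None if the condition does not match.
    Hypothetical helper for illustration only.
    """
    # Affix files are written in upper case, so compare in upper case.
    # The condition is anchored at the end of the word.
    if re.search("(?:" + condition + ")$", word.upper()):
        return word + suffix.lower()
    return None

print(apply_suffix_rule("advarsel", "[^S]", "S"))  # advarsels
print(apply_suffix_rule("vertrag", "[^S]", "S"))   # vertrags
print(apply_suffix_rule("haus", "[^S]", "S"))      # None, already ends in "s"
```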

Hello everyone!

I cannot get this to work, neither with a German dictionary nor with the Norwegian example supplied on the tsearch website. That means that, just like Hannes, I only get compound word support when no 's' is inserted, in both German and Norwegian:
"Vertragstrafe" works, but not "Vertragsstrafe", which is the correct form.

So I tried it the other way around: My dictionary consists of two words:

---
vertrag/zs
strafe/z
---

My affix file just switches on compounds and allows for s-insertion as described in the Norwegian tutorial:

---
compoundwords controlled z
suffixes
flag s:
[^S] > S # does not end in "s": append "s" and include in compound check ("Recht" > "Rechts-")
---
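
For reference, this is the behaviour I expect from the two files above, written as a small Python sketch. It is only a model of my reading of the compound rules, not how tsearch2/ispell is actually implemented; the names `DICTIONARY`, `compound_forms` and `split_compound` are made up for illustration.

```python
# Hypothetical in-memory model of the two-word dictionary: word -> flags.
DICTIONARY = {"vertrag": {"z", "s"}, "strafe": {"z"}}
COMPOUND_FLAG = "z"  # from "compoundwords controlled z"
LINK_FLAG = "s"      # flag whose rule appends a linking "s"

def compound_forms(word, flags):
    """Forms a dictionary word may take as a non-final compound part."""
    if COMPOUND_FLAG not in flags:
        return []
    forms = [word]
    # Rule "[^S] > S": only words not ending in "s" get the linking "s".
    if LINK_FLAG in flags and not word.endswith("s"):
        forms.append(word + "s")
    return forms

def split_compound(token):
    """Return a list of base words that spell out token, or None."""
    flags = DICTIONARY.get(token)
    if flags is not None and COMPOUND_FLAG in flags:
        return [token]
    for word, word_flags in DICTIONARY.items():
        for form in compound_forms(word, word_flags):
            if token.startswith(form) and len(token) > len(form):
                rest = split_compound(token[len(form):])
                if rest is not None:
                    return [word] + rest
    return None

for token in ("vertragstrafe", "strafevertrag", "vertragsstrafe"):
    print(token, split_compound(token))
```

Under this reading all three tokens, including "vertragsstrafe", should decompose into 'vertrag' and 'strafe'.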

ts_debug yields:

tstest=# SELECT tsearch2.ts_debug('vertragstrafe strafevertrag vertragsstrafe');
                                    ts_debug
-------------------------------------------------------------------------------------
 (german,lword,"Latin word",vertragstrafe,"{ispell_de,simple}","'strafe' 'vertrag'")
 (german,lword,"Latin word",strafevertrag,"{ispell_de,simple}","'strafe' 'vertrag'")
 (german,lword,"Latin word",vertragsstrafe,"{ispell_de,simple}",'vertragsstrafe')
(3 rows)

I would say the ispell compound support does not honor the s flag in compounds. Could this feature have been lost in a regression? It must have worked for Norwegian at some point. (Take the "overtrekksgrilldresser" example from the tsearch2:compounds tutorial, which I cannot reproduce.)

Any hints?

Alexander

	Regards,
		Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@xxxxxxxxxx, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
