Re: full text search: the concept of a "word"

My textfields are trigger-generated using information from a number of
tables: these fields can be, say, a couple of thousand characters
wide.
Up to here, there's no problem.
What I'd like to do is define - possibly using regexps - what
constitutes a word. For instance, my word separator is a semicolon,
not a space; a dash is not a separator, and neither are
language-specific characters (which might be interpreted that way by a
language-agnostic tool)...
BTW, I use UTF-8 as my database encoding if it's of any importance.

I do not see a big problem: just write your own parser.
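
To make the "word" rule from the question concrete, here is a minimal sketch in Python. It is not a tsearch2 parser (those are written in C against the tsearch2 parser interface); it only illustrates the splitting rule itself. The function name split_words and the sample string are made up for the illustration, and trimming whitespace around each word is an assumption the original question does not state.

```python
import re

# Assumed rule from the question: only a semicolon separates words.
# Spaces, dashes and non-ASCII letters stay inside a word.
SEPARATOR = re.compile(r";")


def split_words(text: str) -> list[str]:
    """Split text on semicolons, trim surrounding whitespace (an assumption),
    and drop empty pieces produced by doubled or trailing semicolons."""
    return [piece.strip() for piece in SEPARATOR.split(text) if piece.strip()]


if __name__ == "__main__":
    # Python str is Unicode, so UTF-8 input decoded to str needs no special care.
    print(split_words("Jean-Luc Müller; full text search;; naïve"))
```

A real parser would have to return the same kind of boundaries to tsearch2, but the decision about what counts as a separator would look much like the one above.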

There may be a problem with UTF-8: only the CVS HEAD version of tsearch2 supports UTF-8. But you can find a patch for 8.1 at http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/




--
Teodor Sigaev                                   E-mail: teodor@xxxxxxxxx
                                                   WWW: http://www.sigaev.ru/

