
Re: Please comment on the following OpenFTS/tsearch2 issues!

1. While tsearch2 provides fairly complete boolean search expression support
with AND - &, OR - |, NOT - !, and grouping - (), OpenFTS appears to only
have support for ANDing search terms. Is there some reason it hasn't been
extended to support full tsearch2 search expressions? Has anyone modified
OpenFTS to do this?

Historical reasons and simplicity, nothing more.
We have not modified OpenFTS. People often ask us about converting plain text to a tsquery, so 8.2 will include plainto_tsquery(), which returns the same result as the OpenFTS query parser.
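
To illustrate what an AND-only conversion from plain text to a query string amounts to, here is a minimal Python sketch. The function name plain_to_and_query is hypothetical, and the sketch only splits on non-word characters; it does not model the parser or dictionaries (stemming, stop words) that the real functions apply.

```python
import re


def plain_to_and_query(text):
    """Join the words of free text with '&' to form an AND-only query string."""
    # Assumption: a "word" is a run of letters, digits or underscores.
    # Stemming and stop-word removal are deliberately not modelled here.
    words = re.findall(r"\w+", text.lower())
    return " & ".join(words)


if __name__ == "__main__":
    print(plain_to_and_query("Fat cats, fat rats"))
```

The resulting string is in the form that to_tsquery accepts, with every term required.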



2. Neither OpenFTS nor tsearch2 supports exact phrase matching. I've seen the
workaround to support matching a single exact phrase by modifying the WHERE
clause with textcolumn ~* "exact phrase". Does this give reasonable
performance? Has anyone implemented exact phrase matching in complex search
expressions like ("exact phrase1" AND term1) OR (NOT "exact phrase2" AND
"exact phrase3") ?

We don't plan to develop phrase search until we have a clean idea of how to support complex queries and compound words; see the discussion at http://www.pgsql.ru/db/mw/msg.html?mid=2111601
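
For the single-phrase workaround mentioned in the question, one possible shape is a two-stage filter: an AND query over the phrase's words narrows the candidates through the full-text index, and a case-insensitive substring check then keeps only rows containing the exact phrase. The Python sketch below only builds the SQL text and its parameters. The table and column names (docs, body, idxfti) are hypothetical, and the %s placeholders assume a DB-API driver with the "format" parameter style.

```python
import re


def phrase_search_sql(phrase, table="docs", text_col="body", fts_col="idxfti"):
    """Return (sql, params) for a two-stage single-phrase search."""
    words = re.findall(r"\w+", phrase.lower())
    if not words:
        raise ValueError("phrase contains no searchable words")
    # Stage 1: every word of the phrase must match via the full-text column.
    and_query = " & ".join(words)
    # Stage 2: case-insensitive substring check on the raw text.
    # Identifiers are interpolated directly, so they must be trusted names,
    # never user input; only the query and the phrase are bound parameters.
    sql = (
        f"SELECT * FROM {table} "
        f"WHERE {fts_col} @@ to_tsquery(%s) "
        f"AND position(lower(%s) in lower({text_col})) > 0"
    )
    return sql, (and_query, phrase)


if __name__ == "__main__":
    sql, params = phrase_search_sql("Exact Phrase")
    print(sql)
    print(params)
```

Whether this performs acceptably depends on how selective the AND query is, since the substring check runs on every row that survives the first stage. It does not extend to the nested phrase expressions asked about above.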


3. The following summarizes what I've read about performance and scalability
of OpenFTS and/or tsearch2:

a) don't expect OpenFTS/tsearch2 to perform/scale as well as dedicated
search engines like Lucene, http://lucene.apache.org/,
http://archives.postgresql.org/pgsql-general/2002-05/msg01156.php.
Yes, the GiST index is good for online updates, but it has problems with big data sets.
We plan to add an inverted index to 8.2, with which tsearch2 will work at a speed comparable to Lucene...

A first version has already been published; look for the announcement :)


b) OR queries are slower than AND queries,
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/oscon_tsearch2/optimization.html.

Yes

Do you agree with this summary? If you are using either OpenFTS or tsearch2
in production, has the performance been acceptable? For my application I
could be looking at several million documents averaging about 3 pages each
(I only have ballpark figures at present).

We know of a tsearch2 installation working with 4 million documents.


4. If you are using either OpenFTS or tsearch2 in production why did you
choose OpenFTS over tsearch2 or vice versa? One of the advantages of
tsearch2 that I can see is that, once you have setup your database and
indexed your documents, you can talk to the database directly from your
application using SQL without needing to go through Perl first. This assumes
that you're ok with tsearch2 search expression syntax so you can use
functions like to_tsquery. It also assumes that you don't need sophisticated
exact phrase matching.

OpenFTS can run on a different machine from PostgreSQL, and it can index files directly from the file system.


5. Are there any scripts, tools, add-ons, etc. that you can recommend?

We can tweak OpenFTS/tsearch2 for you.

--
Teodor Sigaev                                   E-mail: teodor@xxxxxxxxx
                                                   WWW: http://www.sigaev.ru/

