
Re: Database-based alternatives to tsearch2?

On Tue, 2006-12-12 at 12:19 -0600, Wes wrote:
> I'm looking for a non index-based full text indexing - one that stores the
> information as table data instead of index data.  I do not need to implement
> SQL operators for searches.  The application library would need to implement
> the actual word search.
> 

Store the tsvector (a custom type provided by tsearch2) as a separate
column in the table. This data type holds all the important information
about the indexed text, such as distinct words and some position
information, but it takes up much less space than a large document.

The tsearch2 package provides a lot of functionality even without the
index. But after you have a tsvector column, you can create an index on
it if you want.
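
To make the idea concrete, here is a rough Python sketch of the kind of
information such a column holds: each distinct word mapped to the positions
where it occurs. This is only an illustration, not how tsearch2 builds a
tsvector (the real type also normalizes words and drops stop words, which this
sketch does not attempt). The names to_vector and matches_all are made up for
the example.

```python
import re
from collections import defaultdict


def to_vector(text):
    """Map each distinct lowercased word to its 1-based positions in the text."""
    vector = defaultdict(list)
    words = re.findall(r"[a-z0-9]+", text.lower())
    for position, word in enumerate(words, start=1):
        vector[word].append(position)
    return dict(vector)


def matches_all(vector, words):
    """Application-side search: True if every query word occurs in the vector."""
    return all(word.lower() in vector for word in words)


doc = "The quick brown fox jumps over the lazy dog"
vec = to_vector(doc)
print(vec["the"])                         # positions of "the"
print(matches_all(vec, ["fox", "DOG"]))   # both words present?
```

Because the vector is ordinary row data, an application library could run a
search like matches_all over it without any index existing at all.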

> Indexes are too fragile.  Our documents will be offline, and re-indexing
> would be impossible.  Additionally, as I understand it, tsearch2 doesn't
> scale to the numbers I need (hundreds of millions of documents).
> 

Try PostgreSQL 8.2 with tsearch2 and a GIN index on the tsvector column.
As I understand it, that scales very well.

Also, as I understand it, a GIN index should not need to be reindexed
unless there is a huge shift in the set of distinct words you're using.
However, if you do need to reindex, you can if you have the tsvector
column. 
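
As a loose illustration of why keeping the tsvector column makes a rebuild
possible: GIN is an inverted index, mapping each word to the rows that contain
it, and that mapping can be regenerated from the stored vectors alone without
the original documents. The Python below is a toy model of that idea, not
GIN's actual implementation; build_inverted_index, search, and the sample data
are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(stored_vectors):
    """Rebuild a word -> set of document ids map from stored per-document vectors.

    stored_vectors maps a document id to a dict of word -> positions,
    standing in for the tsvector column of each row.
    """
    index = defaultdict(set)
    for doc_id, vector in stored_vectors.items():
        for word in vector:
            index[word].add(doc_id)
    return dict(index)


def search(index, words):
    """Return the ids of documents that contain every query word."""
    postings = [index.get(word, set()) for word in words]
    return set.intersection(*postings) if postings else set()


# Stand-in for rows that kept only their vectors; the documents are offline.
stored = {
    1: {"postgresql": [1], "index": [2]},
    2: {"index": [1], "gin": [2]},
}
idx = build_inverted_index(stored)
print(search(idx, ["index", "gin"]))
```

The original text never appears in the rebuild, which is the point: only the
stored vectors are needed.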

Regards,
	Jeff Davis




