On Tue, 2006-12-12 at 12:19 -0600, Wes wrote: > I'm looking for a non index-based full text indexing - one that stores the > information as table data instead of index data. I do not need to implement > SQL operators for searches. The application library would need to implement > the actual word search. > Store the tsvector (a custom type provided by tsearch2) as a separate column in the table. This data type holds all the important information about the indexed text, such as distinct words and some position information, but it takes up much less space than a large document. The tsearch2 package provides a lot of functionality even without the index. But after you have a tsvector column, you can create an index on it if you want. > Indexes are too fragile. Our documents will be offline, and re-indexing > would be impossible. Additionally, as I undertstand it, tsearch2 doesn't > scale to the numbers I need (hundreds of millions of documents). > Try PostgreSQL 8.2 with tsearch2 using GIN. As I understand it, that's very scalable. Also, as I understand it, a GIN index should not need to be reindexed unless there is a huge shift in the set of distinct words you're using. However, if you do need to reindex, you can if you have the tsvector column. Regards, Jeff Davis