Jasen Betts <jasen@xxxxxxxxxx> wrote: >On 2013-02-23, Raymond C. Rodgers <sinful622@xxxxxxxxx> wrote: >> On 02/23/2013 05:26 AM, Tom Lane wrote: >>> A "virtual" tsvector like that is probably going to be useless for >>> searching as soon as you get a meaningful amount of data, because the >>> only way the DB can implement a search is to compute the tsvector >>> value for each table row and then examine it for the target word(s). >>> What you want is a GIST or GIN index on the contents of the tsvector. > >> I think the only real advantage to using something like this would be a >> space savings in terms of storing the tsvector data, but I don't see >> that being a significant enough reason to go ahead and use this idea in >> a production situation. As mentioned [by pretty much all of us], once >> the table size is sufficiently large there would be a performance >> penalty by to_tsvector being executed on every record in the table. > >Unless the plan comes out as a table scan the index will be used >instead ot to_tsvector() > >When there is a table scan to_tsvector will be used instead of reading >from disk, I don't know how fast to_tsvector is compared to disk, but >usually computing a result is faster than reading it from disk. > >Storing the tsvector in the table is likely to be faster only when a >tablescan is done and the table is fully cached in ram. I guess I was being dumb in assuming that it was obvious that a GIN or GiST index would be needed for decent performance at scale. Without that, a scan of the whole table (or at least all rows matching other search criteria) is needed, which is going to hurt. The benchmarks I mentioned were for a GIN index on the results of the function which generated the tsvector, versus a GIN index on the stored tsvector. In our case, a typical scan for document text against years of accumulated court documents was about 300 ms versus about 1.5 seconds. It may matter that we weren't just looking for matches, but the top K matches based on the ranking function. -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general