Hi Oleg and all, On Wed, Jul 13, 2011 at 08:16, Oleg Bartunov <oleg@xxxxxxxxxx> wrote: > there is problem with estimating of cost scanning gin index in < 9.1 > versions, > so you can set enable_seqscan=off; > or try 9.1 which beta3 now. I re-ran my queries using enable seqscan=off. Now the first query, without ts_rank, uses the GIN index: set enable_seqscan=off; explain analyze select id from posts_1000000 where to_tsquery('english', 'crare') @@ document_vector limit 50; Limit (cost=42290.12..42306.31 rows=50 width=4) (actual time=16.259..16.412 rows=50 loops=1) -> Bitmap Heap Scan on posts_1000000 (cost=42290.12..57877.02 rows=48152 width=4) (actual time=16.256..16.344 rows=50 loops=1) Recheck Cond: ('''crare'''::tsquery @@ document_vector) -> Bitmap Index Scan on index_posts_documents_1000000 (cost=0.00..42278.08 rows=48152 width=0) (actual time=13.265..13.265 rows=49951 loops=1) Index Cond: ('''crare'''::tsquery @@ document_vector) Total runtime: 16.484 ms But the second query, the one that uses ts_rank, is still very slow... Any idea why? Is ts_rank efficient enough to find the best 50 matches among 50 000 documents? set enable_seqscan=off; explain analyze select id from posts_1000000 where to_tsquery('english', 'crare') @@ document_vector order by ts_rank_cd(document_vector, to_tsquery('english', 'crare'), 32) desc limit 50; Limit (cost=59596.98..59597.10 rows=50 width=22) (actual time=296212.052..296212.257 rows=50 loops=1) -> Sort (cost=59596.98..59717.36 rows=48152 width=22) (actual time=296186.928..296187.007 rows=50 loops=1)" Sort Key: (ts_rank_cd(document_vector, '''crare'''::tsquery, 32)) Sort Method: top-N heapsort Memory: 27kB -> Bitmap Heap Scan on posts_1000000 (cost=42290.12..57997.40 rows=48152 width=22) (actual time=70.861..296059.515 rows=49951 loops=1) Recheck Cond: ('''crare'''::tsquery @@ document_vector) -> Bitmap Index Scan on index_posts_documents_1000000 (cost=0.00..42278.08 rows=48152 width=0) (actual time=24.922..24.922 rows=49951 loops=1) Index Cond: ('''crare'''::tsquery @@ document_vector) Total runtime: 296220.493 ms >> By the way, does ts_rank is supposed to use a GIN index when it's >> available? > > no, I see no benefit :) Ok. But what is the solution to improve ts_rank execution time? Am I doing something wrong? Thanks for your help, Nicolas -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general