Re: Ways to speed up ts_rank

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/10/2012 06:38 AM, François Beausoleil wrote:
Notice the ts_rank() is on the outer query, which means it'll only
run on the subset of documents which match the query. This is
explicitly mentioned in the docs:

"""Ranking can be expensive since it requires consulting the tsvector
of each matching document, which can be I/O bound and therefore slow.
Unfortunately, it is almost impossible to avoid since practical
queries often result in large numbers of matches."""

(last paragraph of)
http://www.postgresql.org/docs/current/static/textsearch-controls.html#TEXTSEARCH-RANKING

Indeed, I have studied that paragraph in depth, trying to gather as much possible meaning from it as I can. :-)

However, the following two queries take exactly the same time, suggesting to me that ts_rank_cd is really only looking at matching rows, not all rows:

SELECT docid, coefficient * ts_rank_cd('{0.1, 0.2, 0.5, 1.0}',
              text_vector, to_tsquery('english', 'stuff')) AS rank
FROM pgtextindex
WHERE (text_vector @@ to_tsquery('english', 'stuff'))
ORDER BY rank DESC
limit 3;

SELECT docid, coefficient * ts_rank_cd('{0.1, 0.2, 0.5, 1.0}',
              text_vector, to_tsquery('english', 'stuff')) AS rank
FROM (SELECT * FROM pgtextindex
      WHERE (text_vector @@ to_tsquery('english', 'stuff'))) AS filtered
ORDER BY rank DESC
limit 3;

Thanks for the suggestion though. By the way, all the tsvectors are already loaded into the kernel cache when I execute the queries, so ranking large documents is in fact CPU bound rather than I/O bound. The CPU is pegged for the whole time.

Shane



--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux