Re: Understanding tsearch2 performance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ivan,

here is explain analyze output - 7122 out of 528155 docs
tseval=# select count(*) from document;
count --------
 528155
(1 row)

Time: 345,562 ms

tseval=# explain analyze select docno, title from document where vector @@ to_tsquery('english','mars');
 Bitmap Heap Scan on document  (cost=1655.97..10518.34 rows=2641 width=13) (actual time=3.127..11.556 rows=7122 loops=1)
   Recheck Cond: (vector @@ '''mar'''::tsquery)
   ->  Bitmap Index Scan on idx_vector  (cost=0.00..1655.31 rows=2641 width=0) (actual time=1.899..1.899 rows=7122 loops=1)
         Index Cond: (vector @@ '''mar'''::tsquery)
 Total runtime: 12.303 ms
(5 rows)

This is PostgreSQL 8.4.4 on Ubuntu machine.


Oleg

On Wed, 14 Jul 2010, Ivan Voras wrote:

On 07/14/10 16:03, Kevin Grittner wrote:
Ivan Voras <  ivoras@xxxxxxxxxxx  > wrote:
On 07/14/10 15:49, Stephen Frost wrote:

Regarding the statistics, it's entirely possible that the index
is *not* the fastest way to pull this data (it's nearly 10% of
the table..)

I think that what I'm asking here is: is it reasonable for
tsearch2 to extract 8,500 rows from an index of 90,000 rows in 118
ms, given that the approximately same task can be done with an
unindexed "LIKE" operator in nearly the same time?

The answer is "yes."  When it's 10% of the table, a sequential scan
can be more efficient than an index, as Stephen indicated.

Ok, to verify this I've tried increasing statistics on the field and
running vacumm analyze full, which didn't help. Next, I've tried setting
enable_indexscan to off, which also didn't do it:

cms=> set enable_indexscan=off;
SET
cms=> explain analyze select id,title from forum where _fts_ @@
'fer'::tsquery order by id limit 10;
                                                            QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=363.18..363.20 rows=10 width=35) (actual
time=192.243..192.406 rows=10 loops=1)
  ->  Sort  (cost=363.18..363.40 rows=91 width=35) (actual
time=192.229..192.283 rows=10 loops=1)
        Sort Key: id
        Sort Method:  top-N heapsort  Memory: 25kB
        ->  Bitmap Heap Scan on forum  (cost=29.21..361.21 rows=91
width=35) (actual time=12.071..136.130 rows=8449 loops=1)
              Recheck Cond: (_fts_ @@ '''fer'''::tsquery)
              ->  Bitmap Index Scan on forum_fts  (cost=0.00..29.19
rows=91 width=0) (actual time=11.169..11.169 rows=8449 loops=1)
                    Index Cond: (_fts_ @@ '''fer'''::tsquery)
Total runtime: 192.686 ms
(9 rows)

Any ideas on how to verify this?






	Regards,
		Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@xxxxxxxxxx, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux