Re: Full text search - query plan? PG 8.4.1

Jesper Krogh <jesper@xxxxxxxx> · Sun, 18 Oct 2009 20:55:49 +0200

Tom Lane wrote:
> Jesper Krogh <jesper@xxxxxxxx> writes:
>> "commonterm" matches 37K of the 50K documents (majority), but the query
>> plan is "odd" in my eyes.
> 
>> * Why does it mis-guess the cost of a Seq Scan on textbody so much?
> 
> The cost looks about right to me.  The cost units are not milliseconds.
> 
>> * Why doesn't it use the index in "id" to fetch the 10 records?
> 
> You haven't *got* an index on id, according to the \d output.

Thanks (/me bangs my head against the table). I somehow assumed that "id
SERIAL" automatically created it for me. Even enough to not looking for
it to confirm.

> The only part of your results that looks odd to me is the very high cost
> estimate for the bitmapscan:
> 
>>          ->  Bitmap Heap Scan on textbody  (cost=267377.23..269147.80
>> rows=36771 width=5) (actual time=15.763..30.576 rows=37133 loops=1)
>>                Recheck Cond: (textbody_body_fts @@
>> to_tsquery('commonterm'::text))
>>                ->  Bitmap Index Scan on textbody_tfs_idx
>> (cost=0.00..267368.04 rows=36771 width=0) (actual time=15.419..15.419
>> rows=37134 loops=1)
>>                      Index Cond: (textbody_body_fts @@
>> to_tsquery('commonterm'::text))
> 
> When I try this with a 64K-row table having 'commonterm' in half of the
> rows, what I get is estimates of 1530 cost units for the seqscan and
> 1405 for the bitmapscan (so it prefers the latter).  It will switch over
> to using an index on id if I add one, but that's not the point at the
> moment.  There's something strange about your tsvector index.  Maybe
> it's really huge because the documents are huge?

huge is a relative term, but length(ts_vector(body)) is about 200 for
each document. Is that huge? I can postprocess them a bit to get it down
and will eventually do that before going to "production".

Thanks alot.

-- 
Jesper

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance