Tom Lane wrote: > Jesper Krogh <jesper@xxxxxxxx> writes: >> "commonterm" matches 37K of the 50K documents (majority), but the query >> plan is "odd" in my eyes. > >> * Why does it mis-guess the cost of a Seq Scan on textbody so much? > > The cost looks about right to me. The cost units are not milliseconds. > >> * Why doesn't it use the index in "id" to fetch the 10 records? > > You haven't *got* an index on id, according to the \d output. Thanks (/me bangs my head against the table). I somehow assumed that "id SERIAL" automatically created it for me. Even enough to not looking for it to confirm. > The only part of your results that looks odd to me is the very high cost > estimate for the bitmapscan: > >> -> Bitmap Heap Scan on textbody (cost=267377.23..269147.80 >> rows=36771 width=5) (actual time=15.763..30.576 rows=37133 loops=1) >> Recheck Cond: (textbody_body_fts @@ >> to_tsquery('commonterm'::text)) >> -> Bitmap Index Scan on textbody_tfs_idx >> (cost=0.00..267368.04 rows=36771 width=0) (actual time=15.419..15.419 >> rows=37134 loops=1) >> Index Cond: (textbody_body_fts @@ >> to_tsquery('commonterm'::text)) > > When I try this with a 64K-row table having 'commonterm' in half of the > rows, what I get is estimates of 1530 cost units for the seqscan and > 1405 for the bitmapscan (so it prefers the latter). It will switch over > to using an index on id if I add one, but that's not the point at the > moment. There's something strange about your tsvector index. Maybe > it's really huge because the documents are huge? huge is a relative term, but length(ts_vector(body)) is about 200 for each document. Is that huge? I can postprocess them a bit to get it down and will eventually do that before going to "production". Thanks alot. -- Jesper -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance