Re: How does the query planner make its plan?

"Scott Marlowe" <scott.marlowe@xxxxxxxxx> · Tue, 6 Nov 2007 09:49:51 -0600

On 11/6/07, Christian Schröder <cs@xxxxxxxxx> wrote:
> Hi list,
> once again I do not understand how the query planner works and why it
> apparently does not find the best result.
> I have a table with about 125 million rows. There is a char(5) column
> with a (non-unique) index. When I try to find the distinct values in
> this column using the following sql statement:
>
> select distinct exchange from foo
>
> the query planner chooses not to use the index, but performs a
> sequential scan. When I disfavour the use of sequential scans ("set
> enable_seqscan = off") the performance is more than 6 times better. Why
> does the query planner's plan go wrong? The table has been vacuum
> analyzed just before I ran the queries.

Does the sequential scan stay slow the second time you run it?  It's
possible that if you always run the seq scan first, then the index
scan second, the index scan will benefit from caching.

Assuming that repeated runs of each type shows the index scan to be
faster, then it's likely that it is both fitting into memory AND that
the table data is better ordered than the db thinks it is.

Have you tried upping the stats target on the exchange column and
re-running analyze to see if that helps?

Generally, random_page_cost should not really be 1 unless you're
running a db that wholly fits into memory or is on a SSD.

Note that even then index fetches cost more than seq scan fetches
because with an index fetch you hit the index THEN hit the table (two
fetches) where in a seq fetch you just hit the table.

I'd also try clustiner the table on exchange.

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your
       message can get through to the mailing list cleanly