On 11/6/07, Christian Schröder <cs@xxxxxxxxx> wrote: > Hi list, > once again I do not understand how the query planner works and why it > apparently does not find the best result. > I have a table with about 125 million rows. There is a char(5) column > with a (non-unique) index. When I try to find the distinct values in > this column using the following sql statement: > > select distinct exchange from foo > > the query planner chooses not to use the index, but performs a > sequential scan. When I disfavour the use of sequential scans ("set > enable_seqscan = off") the performance is more than 6 times better. Why > does the query planner's plan go wrong? The table has been vacuum > analyzed just before I ran the queries. Does the sequential scan stay slow the second time you run it? It's possible that if you always run the seq scan first, then the index scan second, the index scan will benefit from caching. Assuming that repeated runs of each type shows the index scan to be faster, then it's likely that it is both fitting into memory AND that the table data is better ordered than the db thinks it is. Have you tried upping the stats target on the exchange column and re-running analyze to see if that helps? Generally, random_page_cost should not really be 1 unless you're running a db that wholly fits into memory or is on a SSD. Note that even then index fetches cost more than seq scan fetches because with an index fetch you hit the index THEN hit the table (two fetches) where in a seq fetch you just hit the table. I'd also try clustiner the table on exchange. ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your message can get through to the mailing list cleanly