Scott Marlowe wrote:

> Douglas Alan wrote:
>
>> Okay -- no problem:
>>
>> set enable_seqscan = on;
>> explain analyze select * from maindb_astobject
>>    where survey_id = 2
>>    limit 1000;
>>
>> "Limit  (cost=0.00..48.03 rows=1000 width=78) (actual time=84837.835..265938.258 rows=1000 loops=1)"
>> "  ->  Seq Scan on maindb_astobject  (cost=0.00..3538556.10 rows=73675167 width=78) (actual time=84837.825..265932.121 rows=1000 loops=1)"
>> "        Filter: (survey_id = 2)"
>> "Total runtime: 265942.416 ms"
>>
>> set enable_seqscan = off;
>> explain analyze select * from maindb_astobject
>>    where survey_id = 2
>>    limit 1000;
>>
>> "Limit  (cost=0.00..67.37 rows=1000 width=78) (actual time=172.248..225.219 rows=1000 loops=1)"
>> "  ->  Index Scan using maindb_astobject_survey_id on maindb_astobject  (cost=0.00..4963500.87 rows=73675167 width=78) (actual time=172.240..221.078 rows=1000 loops=1)"
>> "        Index Cond: (survey_id = 2)"
>> "Total runtime: 227.412 ms"

> What was the random_page_cost during these two queries?

4.

> Assuming seq_page_cost is 1,

Yes, it is.

> and random_page_cost was 4 or something, lowering it should
> force the move to an index scan.

I just tried changing random_page_cost to 1, but the query still does a seq scan.

> If you were already at 1.0 or so, then yeah, the cost
> estimation is off. Since index scans cost CPU (at least I
> think they do), you might try lowering your cpu_* costs to see
> if that helps.

How would lowering random_page_cost and all the cpu_* costs differ from just raising seq_page_cost? I have to raise seq_page_cost from 1 to 34 to force an index scan, and I can't imagine that changing the value that radically would be a good idea. Alternatively, if I set random_page_cost to 1 and set cpu_tuple_cost, cpu_index_tuple_cost, and cpu_operator_cost all to 0, that still doesn't cause Postgres to do an index scan for this query. (The exact session I used is in the P.S. below.)

>> P.S. Here are the stats on the column. It appears that my recollection
>> of 25% of the table matching was a bit off. It's actually 98.5%! That
>> might better explain why Postgres wants to do a sequential scan. The
>> problem is that it still has to scan a million rows sequentially
>> before it finds a single matching row, since the matching rows are at
>> the end of the table:

> Yeah, that's a really skewed distribution. Partitioning may work out,
> especially if you often select on that one field.

Is there a way for me to alter the statistics table? I tried changing the values in pg_stats, but that table is just a view, so Postgres won't let me. pg_statistic, on the other hand, is rather opaque.

Alternatively, can I delete the statistics for the column? It's the statistics that are hurting me here. To delete them, I tried setting the column's statistics target to 0 and analyzing the column, but that just left the current statistics in place. So I tried setting the target to 1, but that's one value too many to eliminate this problem! (The commands I used are in the P.P.S. below.)

|>ouglas
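
P.S. For concreteness, here is roughly the session I used for the cost-settings experiment above. (A sketch: the values are the ones I described, but the begin/set local/rollback wrapper is just a tidy way to keep the settings from leaking out of the session; I don't claim it changes the outcome.)

    begin;
    -- Make random I/O look as cheap as sequential I/O, and zero
    -- out all three CPU cost components.
    set local random_page_cost = 1;
    set local cpu_tuple_cost = 0;
    set local cpu_index_tuple_cost = 0;
    set local cpu_operator_cost = 0;
    explain analyze select * from maindb_astobject
        where survey_id = 2
        limit 1000;
    rollback;  -- the set local values are discarded here

Even with all of those set, the planner still picks the seq scan.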
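
P.P.S. And this is roughly what I meant by "setting statistics for the column to 0 and analyzing the column":

    -- Set the column's statistics target to 0 and re-analyze just
    -- that column. This did NOT remove the existing statistics;
    -- the old entry for the column stayed in place.
    alter table maindb_astobject alter column survey_id set statistics 0;
    analyze maindb_astobject (survey_id);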