--- Simon Riggs <simon@xxxxxxxxxxxxxxx> skrev: > > The distribution of rows with those values also makes a difference to > the results. ANALYZE assumes that all values are randomly distributed > within the table, so if the values are clumped together for whatever > reason the ndistinct calc is less likely to take that into account. This is an important factor. As a summary, one table is defined like this: Table "public.step_result_subset" Column | Type | Modifiers -------------+---------+----------- id | integer | not null uut_result | integer | step_parent | integer | Indexes: "step_result_subset_pkey" PRIMARY KEY, btree (id) "step_result_subset_parent_key" btree (step_parent) "step_result_uut_result_idx" btree (uut_result) The values in step_result_subset.uut_result is clumped together (between 10 and 1000 of same value, and also increasing through the table). The rows where step_result_subset.step_parent is 0 (a special case) is distributed within the table. Even when i set statistics on test_result_subset.uut_result to 1000 7.4.14 picks a better plan than 8.2.0 for some returned datasets. The best results for both 7.4.14 and 8.2.0 is if i remove the index step_result_subset_parent_key. I will have to check if other queries which uses step_result_subset.step_parent will be "broken" by removing the index but i think it should be ok. I have gotten some ideas from this thread , read some more documentation, read the archives, and tested other queries and will try to speed up some more advance queries. Thanks everyone. best regards Rolf Østvik __________________________________________________ Bruker du Yahoo!? Lei av spam? Yahoo! Mail har den beste spambeskyttelsen http://no.mail.yahoo.com