Here is a follow-up on the step-by-step procedure proposed by PT #1 - setup postgresql planner's cost estimate settings for my hardware. ---------------------------------------------------------------------------- ---------- Current parameters values described in section 18.7.2 haven't been changed except for the effective_cache_size seq_page_cost = 1 random_page_cost = 4 cpu_tuple_cost = 0.01 cpu_index_tuple_cost = 0.005 cpu_operator_cost = 0.0025 effective_cache_size = 10GB I did a bunch of tests on frequently used queries to see how well they perform - using SET enable_seqscan = ON/OFF. As described earlier in this tread, the planner use Seq Scan on tables even if using an Index Scan is in this case 5 times faster! Here are the logs of EXPLAIN ANALYSE on a query... osmdump=# SET enable_seqscan = ON; osmdump=# EXPLAIN ANALYSE SELECT user_id, id AS changeset_id,closed FROM changesets WHERE changesets.user_id IN(SELECT id FROM sample.users); ---------------------------------------------------------------------------- ----------------------------------------------------- Hash Semi Join (cost=21.50..819505.27 rows=726722 width=24) (actual time=1574.914..7444.938 rows=338568 loops=1) Hash Cond: (changesets.user_id = users.id) -> Seq Scan on changesets (cost=0.00..745407.22 rows=25139722 width=24) (actual time=0.002..4724.578 rows=25133929 loops=1) -> Hash (cost=14.00..14.00 rows=600 width=8) (actual time=0.165..0.165 rows=600 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 24kB -> Seq Scan on users (cost=0.00..14.00 rows=600 width=8) (actual time=0.003..0.073 rows=600 loops=1) Total runtime: 7658.715 ms (7 rows) osmdump=# SET enable_seqscan = OFF; osmdump=# EXPLAIN ANALYSE SELECT user_id, id AS changeset_id,closed FROM changesets WHERE changesets.user_id IN(SELECT id FROM sample.users); ---------------------------------------------------------------------------- -------------------------------------------------------------------- Nested Loop (cost=10000000015.94..10001072613.45 rows=726722 width=24) (actual time=0.268..1490.515 rows=338568 loops=1) -> HashAggregate (cost=10000000015.50..10000000021.50 rows=600 width=8) (actual time=0.207..0.531 rows=600 loops=1) -> Seq Scan on users (cost=10000000000.00..10000000014.00 rows=600 width=8) (actual time=0.003..0.037 rows=600 loops=1) -> Index Scan using changesets_useridndx on changesets (cost=0.44..1775.54 rows=1211 width=24) (actual time=0.038..2.357 rows=564 loops=600 Index Cond: (user_id = users.id) Total runtime: 1715.517 ms (6 rows) #2 - Run ANALYZE DATABASE and look at performance/planning improvement. ---------------------------------------------------------------------------- ---------- I ran ANALYZE DATABASE then rerun the query. It did not produce any significant improvement according to the EXPLAIN ANALYSE below... osmdump=# SET enable_seqscan = ON; osmdump=# EXPLAIN ANALYSE SELECT user_id, id AS changeset_id,closed FROM changesets WHERE changesets.user_id IN(SELECT id FROM sample.users); ---------------------------------------------------------------------------- ----------------------------------------------------- Hash Semi Join (cost=21.50..819511.42 rows=729133 width=24) (actual time=1538.100..7307.743 rows=338568 loops=1) Hash Cond: (changesets.user_id = users.id) -> Seq Scan on changesets (cost=0.00..745390.84 rows=25138084 width=24) (actual time=0.027..4620.691 rows=25133929 loops=1) -> Hash (cost=14.00..14.00 rows=600 width=8) (actual time=0.300..0.300 rows=600 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 24kB -> Seq Scan on users (cost=0.00..14.00 rows=600 width=8) (actual time=0.022..0.187 rows=600 loops=1) Total runtime: 7519.254 ms (7 rows) osmdump=# SET enable_seqscan = OFF; osmdump=# EXPLAIN ANALYSE SELECT user_id, id AS changeset_id,closed FROM changesets WHERE changesets.user_id IN(SELECT id FROM sample.users); ---------------------------------------------------------------------------- -------------------------------------------------------------------- Nested Loop (cost=10000000015.94..10001090810.49 rows=729133 width=24) (actual time=0.268..1466.248 rows=338568 loops=1) -> HashAggregate (cost=10000000015.50..10000000021.50 rows=600 width=8) (actual time=0.205..0.530 rows=600 loops=1) -> Seq Scan on users (cost=10000000000.00..10000000014.00 rows=600 width=8) (actual time=0.003..0.035 rows=600 loops=1) -> Index Scan using changesets_useridndx on changesets (cost=0.44..1805.83 rows=1215 width=24) (actual time=0.036..2.314 rows=564 loops=600) Index Cond: (user_id = users.id) Total runtime: 1677.447 ms (6 rows) #3 - Run EXPLAIN ANALYZE and look for discrepancies between the estimated and actual times ---------------------------------------------------------------------------- ---------- Looking at above results, there are obvious discrepancies between expected/actual rows and time! I dug a bit by exploring/trying to understand the different concepts explained in... http://www.postgresql.org/docs/9.4/static/planner-stats.html http://www.postgresql.org/docs/8.1/static/planner-stats-details.html http://www.postgresql.org/docs/9.2/static/view-pg-stats.html Concerning discrepancies between the actual number of rows and predicted value, I looked at what pg_stats was saying about user_id in table changesets. Here are the values provided to the planner... Average_width=8 histogram_bounds: the size of the bins varies between 50 and 150000, which make sense because if I had divided the column's values into groups of approximately equal population, I would have produced bins between 1 and 100000 (if sorted by frequency) n_distinct= 20686 (there is actually 464858 distinct values for user_id in the table) most_common_vals: values make sense (I checked the frequency count of a couple most common users_id) correlation=0.617782 (?) most_common_elems, most_common_elem_freqs and elem_count_histogram were empty At this point, I wonder if the assumptions behind the planner's statistics may produce such problems since the distribution of my data is not uniform but follows a power law (some user_id would return millions of records while others only one). This is the farthest I can go at this point. Maybe someone can provide me with more explanations regarding planner's behavior and ways to go further to make it work properly? Best regards, Daniel -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general