"Christian Roche" <Christian.Roche@xxxxxxxxxxxxx> writes: > Now when I select a subset of the possible event IDs in the big table, PG uses the appropriate index: > select * > from mixpanel_events_201409 > where event_id in (3, 4, 5, 6, 7, 8, 9, 10, 11, 373, 375, 376, 318); > Bitmap Heap Scan on mixpanel_events_201409 (cost=7663.36..1102862.70 rows=410022 width=949) > Recheck Cond: (event_id = ANY ('{3,4,5,6,7,8,9,10,11,373,375,376,318}'::integer[])) > -> Bitmap Index Scan on mixpanel_idx_event_201409 (cost=0.00..7560.85 rows=410022 width=0) > Index Cond: (event_id = ANY ('{3,4,5,6,7,8,9,10,11,373,375,376,318}'::integer[])) > But when I try to join the lookup table and select from it, the index is dismissed for a full table scan with a catastrophic effect on performance: > select * > from mixpanel_events_201409 mp > inner join mixpanel_event_list ev on ( ev.id = mp.event_id ) > where ev.id in (3, 4, 5, 6, 7, 8, 9, 10, 11, 373, 375, 376, 318); > Hash Join (cost=20.73..2892183.32 rows=487288 width=1000) > Hash Cond: (mp.event_id = ev.id) > -> Seq Scan on mixpanel_events_201409 mp (cost=0.00..2809276.70 rows=20803470 width=949) > -> Hash (cost=20.57..20.57 rows=13 width=51) > -> Seq Scan on mixpanel_event_list ev (cost=0.00..20.57 rows=13 width=51) > Filter: (id = ANY ('{3,4,5,6,7,8,9,10,11,373,375,376,318}'::integer[])) Given the estimated costs and rowcounts here, I'm far from convinced that the planner made the wrong decision. You seem to be expecting that it will go for a nestloop plan that would require 13 separate indexscans of the large table. Those are unlikely to be only 1/13th the cost of the unified bitmap scan with =ANY; there's going to be overhead from repeated work. If there's say a factor of 2 penalty for the repeated scans, that'd be plenty enough to push the cost of that plan to be more than the hashjoin. If, indeed, the hashjoin is slower, that may suggest that you need to dial down random_page_cost to better represent your environment. But you should be wary of making such an adjustment on the basis of a single example; you might find that it makes other plan choices worse. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance