PostgreSQL 9.0.2 Mac OS X Server 10.6.8 Autovacuum is on, and I have a script that runs vacuum analyze verbose every night along with the backup. I have a situation where I'm experiencing a seq scan on a table with almost 3M rows when my condition is based on a subquery. A google search turned up a way to prevent flattening the subquery into a join using OFFSET 0. This does work, reducing the query from around 1s to around 250ms, most of which is the subquery. My question is why does it do a seq scan when it flattens this subquery into a JOIN? Is it because the emsg_messages table is around 1M rows? Are there some guidelines to when the planner will prefer not to use an available index? I just had a look through postgresql.conf and noticed that I forgot to set effective_cache_size to something reasonable for a machine with 16GB of memory. Would the default setting of 128MB cause this behavior? I can't bounce the production server midday to test that change. EXPLAIN ANALYZE SELECT ema.message_id, ema.email_address_id, ema.address_type FROM emsg_message_addresses ema WHERE ema.message_id IN ( SELECT id FROM emsg_messages msg WHERE msg.account_id = 314 AND msg.outgoing = FALSE AND msg.message_type = 1 AND msg.spam_level < 2 AND msg.deleted_at IS NULL AND msg.id NOT IN ( SELECT emf.message_id FROM emsg_message_folders emf where emf.account_id = 314 ) ) QUERY PLAN Hash Semi Join (cost=84522.74..147516.35 rows=49545 width=12) (actual time=677.058..1083.685 rows=2 loops=1) Hash Cond: (ema.message_id = msg.id) -> Seq Scan on emsg_message_addresses ema (cost=0.00..53654.78 rows=2873478 width=12) (actual time=0.020..424.241 rows=2875437 loops=1) -> Hash (cost=84475.45..84475.45 rows=3783 width=4) (actual time=273.392..273.392 rows=1 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 1kB -> Bitmap Heap Scan on emsg_messages msg (cost=7979.35..84475.45 rows=3783 width=4) (actual time=273.224..273.387 rows=1 loops=1) Recheck Cond: (account_id = 314) Filter: ((NOT outgoing) AND (deleted_at IS NULL) AND (spam_level < 2) AND (NOT (hashed SubPlan 1)) AND (message_type = 1)) -> Bitmap Index Scan on index_emsg_messages_on_account_id (cost=0.00..867.98 rows=34611 width=0) (actual time=9.633..9.633 rows=34997 loops=1) Index Cond: (account_id = 314) SubPlan 1 -> Bitmap Heap Scan on emsg_message_folders emf (cost=704.90..7022.51 rows=35169 width=4) (actual time=5.684..38.016 rows=34594 loops=1) Recheck Cond: (account_id = 314) -> Bitmap Index Scan on index_emsg_message_folders_on_account_id (cost=0.00..696.10 rows=35169 width=0) (actual time=5.175..5.175 rows=34594 loops=1) Index Cond: (account_id = 314) Total runtime: 1083.890 ms EXPLAIN ANALYZE SELECT ema.message_id, ema.email_address_id, ema.address_type FROM emsg_message_addresses ema WHERE ema.message_id IN ( SELECT id FROM emsg_messages msg WHERE msg.account_id = 314 AND msg.outgoing = FALSE AND msg.message_type = 1 AND msg.spam_level < 2 AND msg.deleted_at IS NULL AND msg.id NOT IN ( SELECT emf.message_id FROM emsg_message_folders emf where emf.account_id = 314 ) OFFSET 0 ) QUERY PLAN Nested Loop (cost=84524.89..87496.74 rows=2619 width=12) (actual time=273.409..273.412 rows=2 loops=1) -> HashAggregate (cost=84524.89..84526.89 rows=200 width=4) (actual time=273.345..273.346 rows=1 loops=1) -> Limit (cost=7979.36..84477.60 rows=3783 width=4) (actual time=273.171..273.335 rows=1 loops=1) -> Bitmap Heap Scan on emsg_messages msg (cost=7979.36..84477.60 rows=3783 width=4) (actual time=273.169..273.333 rows=1 loops=1) Recheck Cond: (account_id = 314) Filter: ((NOT outgoing) AND (deleted_at IS NULL) AND (spam_level < 2) AND (NOT (hashed SubPlan 1)) AND (message_type = 1)) -> Bitmap Index Scan on index_emsg_messages_on_account_id (cost=0.00..867.99 rows=34612 width=0) (actual time=9.693..9.693 rows=34998 loops=1) Index Cond: (account_id = 314) SubPlan 1 -> Bitmap Heap Scan on emsg_message_folders emf (cost=704.90..7022.51 rows=35169 width=4) (actual time=5.795..39.420 rows=34594 loops=1) Recheck Cond: (account_id = 314) -> Bitmap Index Scan on index_emsg_message_folders_on_account_id (cost=0.00..696.10 rows=35169 width=0) (actual time=5.266..5.266 rows=34594 loops=1) Index Cond: (account_id = 314) -> Index Scan using index_emsg_message_addresses_on_message_id on emsg_message_addresses ema (cost=0.00..14.69 rows=13 width=12) (actual time=0.056..0.058 rows=2 loops=1) Index Cond: (ema.message_id = msg.id) Total runtime: 273.679 ms Jim Crate -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance