I am trying to insert rows that don't already exist from a temp table into another table. I am using a LEFT JOIN on all the columns and checking for nulls in the base table to know which rows to insert. The problem is that the planner is choosing a nested loop plan which is very slow over the much faster (~40x) hash join. What's interesting is that all the row estimates appear to be fairly accurate. I'm wondering if it has something to do with the GIN indexes on bigint_array_1 and bigint_array_2. Perhaps it misestimates the cost of each index scan? Postgres 9.3.10 on 2.6.32-573.18.1.el6.x86_64 GNU/Linux - base_table has been VACUUM ANALYZED - base_table has GIN indexes on bigint_array_1 and bigint_array_2 - base_table has btree index on id - base_table is 700k rows - temp_table is 4k rows - the bigint arrays are type bigint[] and contain 0 to 5 elements, with a median of 1 element - the time difference between nested loop vs hash join is not based on the cache, I can reproduce it in either order test_db=# BEGIN; BEGIN test_db=# EXPLAIN (ANALYZE, BUFFERS) INSERT INTO base_table ( bigint_array_1, bigint_array_2, id ) ( SELECT s.bigint_array_1, s.bigint_array_2, s.id FROM temp_rows_to_insert s LEFT JOIN base_table t ON s.bigint_array_1 = t.bigint_array_1 AND s.bigint_array_2 = t.bigint_array_2 AND s.id = t.id WHERE s.bigint_array_1 IS NOT NULL AND t.bigint_array_1 IS NULL AND s.bigint_array_2 IS NOT NULL AND t.bigint_array_2 IS NULL AND s.id IS NOT NULL AND t.id IS NULL ); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Insert on base_table (cost=2.97..67498.30 rows=8412 width=72) (actual time=40463.771..40463.771 rows=0 loops=1) Buffers: shared hit=2373347 read=129 dirtied=129, local hit=122 -> Nested Loop Anti Join (cost=2.97..67498.30 rows=8412 width=72) (actual time=13.694..40410.273 rows=4389 loops=1) Buffers: shared hit=2338092, local hit=122 -> Seq Scan on temp_rows_to_insert s (cost=0.00..219.60 rows=9614 width=72) (actual time=0.607..4.746 rows=4389 loops=1) Filter: ((bigint_array_1 IS NOT NULL) AND (bigint_array_2 IS NOT NULL) AND (id IS NOT NULL)) Buffers: local hit=122 -> Bitmap Heap Scan on base_table t (cost=2.97..6.98 rows=1 width=74) (actual time=9.201..9.201 rows=0 loops=4389) Recheck Cond: ((s.bigint_array_2 = bigint_array_2) AND (s.bigint_array_1 = bigint_array_1)) Filter: (s.id = id) Buffers: shared hit=2333695 -> BitmapAnd (cost=2.97..2.97 rows=1 width=0) (actual time=9.199..9.199 rows=0 loops=4389) Buffers: shared hit=2333638 -> Bitmap Index Scan on base_table_bigint_array_2_idx (cost=0.00..1.04 rows=3 width=0) (actual time=2.582..2.582 rows=290 loops=4389) Index Cond: (s.bigint_array_2 = bigint_array_2) Buffers: shared hit=738261 -> Bitmap Index Scan on base_table_bigint_array_1_idx (cost=0.00..1.68 rows=3 width=0) (actual time=6.608..6.608 rows=2 loops=4389) Index Cond: (s.bigint_array_1 = bigint_array_1) Buffers: shared hit=1595377 Total runtime: 40463.879 ms (20 rows) test_db=# rollback; ROLLBACK test_db=# BEGIN; BEGIN test_db=# SET enable_nestloop = false; SET test_db=# EXPLAIN (ANALYZE, BUFFERS) INSERT INTO base_table ( bigint_array_1, bigint_array_2, id ) ( SELECT s.bigint_array_1, s.bigint_array_2, s.id FROM temp_rows_to_insert s LEFT JOIN base_table t ON s.bigint_array_1 = t.bigint_array_1 AND s.bigint_array_2 = t.bigint_array_2 AND s.id = t.id WHERE s.bigint_array_1 IS NOT NULL AND t.bigint_array_1 IS NULL AND s.bigint_array_2 IS NOT NULL AND t.bigint_array_2 IS NULL AND s.id IS NOT NULL AND t.id IS NULL ); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------ Insert on base_table (cost=31711.89..71625.39 rows=8412 width=72) (actual time=1838.650..1838.650 rows=0 loops=1) Buffers: shared hit=50013 read=123 dirtied=410, local hit=122 -> Hash Anti Join (cost=31711.89..71625.39 rows=8412 width=72) (actual time=1798.774..1812.872 rows=4389 loops=1) Hash Cond: ((s.bigint_array_1 = t.bigint_array_1) AND (s.bigint_array_2 = t.bigint_array_2) AND (s.id = t.id)) Buffers: shared hit=14761 dirtied=287, local hit=122 -> Seq Scan on temp_rows_to_insert s (cost=0.00..219.60 rows=9614 width=72) (actual time=0.046..3.033 rows=4389 loops=1) Filter: ((bigint_array_1 IS NOT NULL) AND (bigint_array_2 IS NOT NULL) AND (id IS NOT NULL)) Buffers: local hit=122 -> Hash (cost=18131.96..18131.96 rows=775996 width=74) (actual time=1798.528..1798.528 rows=768415 loops=1) Buckets: 131072 Batches: 1 Memory Usage: 84486kB Buffers: shared hit=10372 dirtied=287 -> Seq Scan on base_table t (cost=0.00..18131.96 rows=775996 width=74) (actual time=0.007..490.851 rows=768415 loops=1) Buffers: shared hit=10372 dirtied=287 Total runtime: 1843.336 ms (14 rows) test_db=# rollback; ROLLBACK Thanks, Jake -- View this message in context: http://postgresql.nabble.com/Planner-chooses-slow-index-heap-scan-despite-accurate-row-estimates-tp5905357.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com. -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance