Abysmal hash join

Florian Weimer <fweimer@xxxxxx> · Mon, 11 Sep 2006 15:48:03 +0200

Hi,

for this simple join of two tables,

SELECT * FROM large_rel n, smaller_rel a
  WHERE n.field_1 = a.field_2 AND a.key = '127.0.0.1';

PostgreSQL 8.1.4 chooses an extremely bad query plan:

 Hash Join  (cost=283.45..8269374.38 rows=14137 width=94)
   Hash Cond: ("outer".field_1 = "inner".field_2)
   ->  Seq Scan on large_rel n  (cost=0.00..6760690.04 rows=301651904 width=52)
   ->  Hash  (cost=283.27..283.27 rows=74 width=42)
         ->  Bitmap Heap Scan on smaller_rel a  (cost=2.26..283.27 rows=74 width=42)
               Recheck Cond: (key = '127.0.0.1'::inet)
               ->  Bitmap Index Scan on smaller_rel_1_key  (cost=0.00..2.26 rows=74 width=0)
                     Index Cond: (key = '127.0.0.1'::inet)

Note the sequential scan over the whole large_rel table (and the
corresponding row estimate is roughly correct).

If I turn off hash joins, I get this plan, which actually completes in
finite time:

 Nested Loop  (cost=2005.35..46955689.59 rows=14137 width=94) (actual time=0.325..0.678 rows=12 loops=1)
   ->  Bitmap Heap Scan on smaller_rel a  (cost=2.26..283.27 rows=74 width=42) (actual time=0.132..0.133 rows=1 loops=1)
         Recheck Cond: (key = '127.0.0.1'::inet)
         ->  Bitmap Index Scan on smaller_rel_1_key  (cost=0.00..2.26 rows=74 width=0) (actual time=0.095..0.095 rows=1 loops=1)
               Index Cond: (key = '127.0.0.1'::inet)
   ->  Bitmap Heap Scan on large_rel n  (cost=2003.09..632110.78 rows=193739 width=52) (actual time=0.182..0.501 rows=12 loops=1)
         Recheck Cond: (n.field_1 = "outer".field_2)
         ->  Bitmap Index Scan on large_rel_1_field_1  (cost=0.00..2003.09 rows=193739 width=0) (actual time=0.148..0.148 rows=12 loops=1)
               Index Cond: (n.field_1 = "outer".field_2)

The row estimate for

  SELECT * FROM smaller_rel a WHERE a.key = '127.0.0.1';

is somewhat off:  

 Bitmap Heap Scan on smaller_rel a  (cost=2.26..283.27 rows=74 width=42) (actual time=0.134..0.135 rows=1 loops=1)
   Recheck Cond: (key = '127.0.0.1'::inet)
   ->  Bitmap Index Scan on smaller_rel_1_key  (cost=0.00..2.26 rows=74 width=0) (actual time=0.108..0.108 rows=1 loops=1)
         Index Cond: (key = '127.0.0.1'::inet)

However, I can't believe that the hash join would be faster even if
there where 74 matching rows in smaller_rel instead of just one.  The
estimate decreases when I increase the portion of smaller_rel which is
scanned by ANALYZE (to something like 10% of the table), but this
doesn't look like a solution.

Any suggestions?

(The queries have been pseudonzmized and may contain typos.)
-- 
Florian Weimer                <fweimer@xxxxxx>
BFK edv-consulting GmbH       http://www.bfk.de/
Durlacher Allee 47            tel: +49-721-96201-1
D-76131 Karlsruhe             fax: +49-721-96201-99