Hello, using a highly surjective left (or inner) join to a table reveals data loss if the hash join method is used. Here, highly surjective means I have a table with about 1.4 million tuples which map to a table with about 40000 tuples. Now here's the explanation: qaos=# explain select anfs.anf_id from anfs left join groups USING (group_id) where anfs.degree=3 and not groups.abelian_p; QUERY PLAN ------------------------------------------------------------------------------------------------------ Hash Join (cost=8111.80..103961.04 rows=299036 width=521) Hash Cond: ("outer".group_id = "inner".group_id) -> Bitmap Heap Scan on anfs (cost=2048.58..71382.50 rows=325594 width=216) Recheck Cond: (degree = 3) -> Bitmap Index Scan on anfs_degree (cost=0.00..2048.58 rows=325594 width=0) Index Cond: (degree = 3) -> Hash (cost=4275.79..4275.79 rows=40172 width=313) -> Bitmap Heap Scan on groups (cost=346.60..4275.79 rows=40172 width=313) Filter: (NOT abelian_p) -> Bitmap Index Scan on groups_not_properties (cost=0.00..346.60 rows=40119 width=0) Index Cond: (abelian_p = false) (11 rows) The query result is: qaos=# select anfs.anf_id from anfs left join groups USING (group_id) where anfs.degree=3 and not groups.abelian_p; anf_id -------- (0 rows) Now whenever other join-methods (like merge join) are used, there are definitely tuples in the result: qaos=# select anfs.anf_id from anfs left join groups USING (group_id) where anfs.degree=3 and not groups.abelian_p limit 20; anf_id -------- 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 (20 rows) qaos=# explain select anfs.anf_id from anfs left join groups USING (group_id) where anfs.degree=3 and not groups.abelian_p limit 20; QUERY PLAN ------------------------------------------------------------------------------------------------ Limit (cost=0.00..138.09 rows=20 width=8) -> Nested Loop (cost=0.00..2064738.73 rows=299036 width=8) -> Index Scan using anfs_degree on anfs (cost=0.00..1078774.74 rows=325594 width=16) Index Cond: (degree = 3) -> Index Scan using groups_group_id on groups (cost=0.00..3.02 rows=1 width=8) Index Cond: ("outer".group_id = groups.group_id) Filter: (NOT abelian_p) (7 rows) The canonical step in this misery, SET enable_hashjoin TO off; makes queries work again, but the performance really suffers from it. Greetings, Sebastian ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq