On 1/12/06 12:23 AM, "Jonel Rienton" <jonel@xxxxxxxxxxxxxxxx> wrote: > Resending sample query, darn where clause didn't wrap > > select a.*,b.* from a > left outer join b on a.id = b.a_id > where b.id is null; I tried something along those lines a while back, and it was orders of magnitude slower. The above produces: explain select address_key, address from addresses left outer join messages on addresses.address_key=originator where originator is null; QUERY PLAN ---------------------------------------------------------------------------- --------------------- Merge Left Join (cost=35684870.14..38457295.97 rows=4090203 width=40) Merge Cond: ("outer".address_key = "inner".originator) Filter: ("inner".originator IS NULL) -> Index Scan using addresses_pkey on addresses (cost=0.00..97213.17 rows=4090203 width=40) -> Sort (cost=35684870.14..36129462.74 rows=177837040 width=11) Sort Key: messages.originator -> Seq Scan on messages (cost=0.00..7215718.40 rows=177837040 width=11) This appears to be very inefficient. B is almost two orders of magnitude larger than A. C is about 3-4 times as big as B (record count). My statement (with the same single 'B' table as above) produces: narc=> explain select address_key, address from addresses where ( not exists(select 1 from messages where originator=addresses.address_key limit 1) ); QUERY PLAN ---------------------------------------------------------------------------- -------------------------------- Seq Scan on addresses (cost=0.00..3398462.98 rows=2045102 width=40) Filter: (NOT (subplan)) SubPlan -> Limit (cost=0.00..0.81 rows=1 width=0) -> Index Scan using messages_i_orig_mdate on messages (cost=0.00..35148.46 rows=43301 width=0) Index Cond: (originator = $0) Which seems like it should be much more efficient. Wes