# neuhauser@xxxxxxxxxx / 2005-07-13 15:58:09 +0200: > # dev@xxxxxxxxxxxx / 2005-07-13 14:09:34 +0100: > > Roman Neuhauser wrote: > > > callrec32=# \d fix.files > > > Table "fix.files" > > > Column | Type | Modifiers > > > --------+------------------------+----------- > > > dir | character varying(255) | > > > base | character varying(255) | > > > Indexes: > > > "base_storename_idx" btree (base, ((((dir)::text || '/'::text) || > > > (base)::text))) > > > "ff_baseonly_idx" btree (base) > > > "ff_storename_idx" btree (((((dir)::text || '/'::text) || > > > (base)::text))) > > > > > > callrec32=# explain select fd.base from fix.dups fd join fix.files ff > > > using (base); > > > QUERY PLAN > > > ---------------------------------------------------------------------------- > > > Hash Join (cost=5340.00..292675.06 rows=176161 width=44) > > > Hash Cond: (("outer".base)::text = ("inner".base)::text) > > > -> Seq Scan on files ff (cost=0.00..117301.58 rows=5278458 > > > width=41) > > > -> Hash (cost=3436.60..3436.60 rows=176160 width=44) > > > -> Seq Scan on dups fd (cost=0.00..3436.60 rows=176160 > > > width=44) > > > (5 rows) > > > What happens to the plan if you SET enable_seqscan=false; first? It's > > presumably getting the row-estimate right, so unless there's terrible > > correlation on "base" in the files table I can only assume it's getting > > the cost estimates horribly wrong. > > callrec32=# SET enable_seqscan=false; > SET > callrec32=# explain select fd.base from fix.dups fd join fix.files ff using (base); > QUERY PLAN > -------------------------------------------------------------------------------------------- > Nested Loop (cost=0.00..1066990.93 rows=176161 width=44) > -> Index Scan using dups_base_key on dups fd (cost=0.00..5570.86 rows=176160 width=44) > -> Index Scan using ff_baseonly_idx on files ff (cost=0.00..6.01 rows=1 width=41) > Index Cond: (("outer".base)::text = (ff.base)::text) > (4 rows) BTW, this query or its equivalent is what I'm really after: callrec32=# explain select c1.storename from fix.dups fd join calls c1 on (fd.base = basename(c1.storename) and c1.iscouple = '1') where not exists (select 1 from fix.files ff where c1.storename = ff.dir || '/' || ff.base); QUERY PLAN -------------------------------------------------------------------------------------------------------- Hash Join (cost=7474.26..23127970.91 rows=2354719 width=60) Hash Cond: ((basename("outer".storename))::text = ("inner".base)::text) -> Index Scan using calls2_iscouple_idx on calls c1 (cost=0.00..22982439.69 rows=2354719 width=60) Filter: ((iscouple = 1::smallint) AND (NOT (subplan))) SubPlan -> Index Scan using ff_storename_idx on files ff (cost=0.00..88570.16 rows=26393 width=0) Index Cond: (($0)::text = (((dir)::text || '/'::text) || (base)::text)) -> Hash (cost=5570.86..5570.86 rows=176160 width=44) -> Index Scan using dups_base_key on dups fd (cost=0.00..5570.86 rows=176160 width=44) (9 rows) calls has (among others): "calls2_basename_storename_idx" btree (basename(storename), storename) WHERE (iscouple = (1)::smallint) "calls2_storename_idx" btree (storename) WHERE (iscouple = (1)::smallint) "calls2_iscouple_idx" btree (id) WHERE (iscouple = (1)::smallint) WHy does it use the calls2_iscouple_idx index when calls.id isn't used anywhere in the query? I would guess that calls2_storename_idx would actually be more useful. -- How many Vietnam vets does it take to screw in a light bulb? You don't know, man. You don't KNOW. Cause you weren't THERE. http://bash.org/?255991 ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend