Re: Planner ignoring to use INDEX SCAN

Richard Huxton <dev@xxxxxxxxxxxx> · Fri, 14 Dec 2007 13:00:54 +0000

Ashish Karalkar wrote:
Thanks Richard for your replay,

here is the output..

Richard Huxton <dev@xxxxxxxxxxxx> wrote: Ashish Karalkar wrote:
Richard Huxton  wrote: Ashish Karalkar wrote:
query which was taking seconds on the join of these two table
suddenly started taking 20/25 min
Show the EXPLAIN ANALYSE of your problem query and someone will be able 
to tell you why.

Here is the output from explain analyse:

Actually, this is the output from EXPLAIN not EXPLAIN ANALYSE. It 
doesn't show what actually happened, just what the planner thought was 
going to happen.

Its from EXPLAIN ANALYSE

No it's not, because that shows an extra set of figures. If you see 
below there is a cost for each stage and an actual time too.

EXPLAIN ANALYSE SELECT d.id, l.name FROM items.documents d JOIN 
lookups.document_class l ON d.class=l.id;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=1.16..14.00 rows=162 width=19) (actual 
time=0.100..0.392 rows=162 loops=1)
   Hash Cond: ((d.class)::text = (l.id)::text)
   ->  Seq Scan on documents d  (cost=0.00..10.62 rows=162 width=9) 
(actual time=0.024..0.121 rows=162 loops=1)
   ->  Hash  (cost=1.07..1.07 rows=7 width=20) (actual 
time=0.029..0.029 rows=8 loops=1)
         ->  Seq Scan on document_class l  (cost=0.00..1.07 rows=7 
width=20) (actual time=0.008..0.015 rows=8 loops=1)
 Total runtime: 0.506 ms
(6 rows)

Are the row-estimates roughly accurate?

Yes Row count of sms_new is approx. same

OK, that's good.

table structures are more or less same with delivery being parent and sms_new being child having index on deliveryid in both tables.

 HashAggregate  (cost=6153350.21..6153352.38 rows=174 width=32)
   ->  Hash Join  (cost=218058.30..6153259.97 rows=6016 width=32)
         Hash Cond: ("outer".deliveryid = "inner".deliveryid)
         ->  Seq Scan on sms_new  (cost=0.00..5240444.80 rows=138939341 width=8)
               Filter: ((otid)::text !~~ 'ERROR%'::text)
         ->  Hash  (cost=218057.87..218057.87 rows=174 width=32)

Well, it knows that it's going to be expensive (cost=5240444.80). Since 
it thinks you'll only get 174 rows from the other side and 6016 
matching, I can't see how an index could be calculated as more expensive.

Try issuing ENABLE seq_scan=off and re-running the EXPLAIN, let's see 
what cost that comes up with.

here is explain out put after setting enable_seqscan=off

 HashAggregate  (cost=27729224.21..27729226.21 rows=160 width=32)
   ->  Nested Loop  (cost=2534.67..27729143.31 rows=5393 width=32)
         ->  Bitmap Heap Scan on delivery  (cost=2094.41..216143.78 rows=160 width=32)

         ->  Bitmap Heap Scan on sms_new  (cost=440.26..171369.61 rows=46931 width=8)

Well, the estimated cost for this one is up to 27 million from the 
previous 6 million. It's doing two bitmap scans and then the nested loop 
which is what's pushing the cost up.

Can you post the query too?

--
  Richard Huxton
  Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

              http://archives.postgresql.org/