Use order by clause, got index scan involved

高健 <luckyjackgao@xxxxxxxxx> · Thu, 8 Nov 2012 15:41:00 +0800

Hi all:

What confused me is that:  When I
select data using order by  clause, I got
the following execution plan:

postgres=# set session enable_indexscan=true;                                                                                               

SET                                                                                                                                         

postgres=# explain SELECT * FROM pg_proc ORDER BY oid;                                                                                          

QUERY PLAN                                                                                                                               

----------------------------------------------------------------------------------------                                                                                             

 Index Scan using
pg_proc_oid_index on pg_proc 
(cost=0.00..321.60 rows=2490 width=552)                                                                                                                                       

(1 row)                                                                                                                                 

postgres=#

My Question is : 

 If I want to find record using the where clause which hold the id column, 
the index scan might be used.

But  I just want to get all the  records on sorted output format,  Why index scan can be used here?

I can’t imagine  that:

Step 1 Index is read into memory, then for each tuple in it, 

Step 2 Then we got  the address of  related data block, and then access the data
block .

Step 2 will be repeated for many times. I think it is not efficient.

But comparing with sort , I got that 
even index scan with all the entry , the cost is still lower than sort
operation:

  postgres=# set session enable_indexscan=false;

  SET

  postgres=# explain SELECT * FROM pg_proc ORDER BY oid;

  QUERY PLAN                            

  -------------------------------------------------------------------

   Sort  (cost=843.36..849.59 rows=2490 width=552)

     Sort Key: oid

     ->  Seq Scan on pg_proc  (cost=0.00..86.90 rows=2490 width=552)

  (3 rows)

  postgres=# 

That is to say: cost of seq scan + sort   > cost of 
index scan for every index entry  + cost of access for every related data ?

Maybe the database system is clever enough to  accumulate data access for same physical
page, and  reduce the times of physical
page acess ?

And can somebody kindly give  some more detailed information which help to know
the execution plan calculation process?

Thanks in advance.

postgres=# set session enable_indexscan=false;
SET
postgres=# explain SELECT * FROM pg_proc ORDER BY oid;
QUERY PLAN
-------------------------------------------------------------------
Sort (cost=843.36..849.59 rows=2490 width=552)
Sort Key: oid
-> Seq Scan on pg_proc (cost=0.00..86.90 rows=2490 width=552)
(3 rows)

postgres=#