On Wed, Nov 7, 2012 at 11:41 PM, 高健 <luckyjackgao@xxxxxxxxx> wrote:Hi all:
What confused me is that: When I select data using order by clause, I got the following execution plan:
postgres=# set session enable_indexscan=true;
SET
postgres=# explain SELECT * FROM pg_proc ORDER BY oid;
QUERY PLAN
----------------------------------------------------------------------------------------
Index Scan using pg_proc_oid_index on pg_proc (cost=0.00..321.60 rows=2490 width=552)
You should probably use sample cases much larger than this when trying to understand the planner. With queries this small, it almost doesn't matter what plan is chosen.
(1 row)
postgres=#
My Question is :
If I want to find record using the where clause which hold the id column, the index scan might be used.
But I just want to get all the records on sorted output format, Why index scan can be used here?
I can’t imagine that:
Step 1 Index is read into memory, then for each tuple in it,
Step 2 Then we got the address of related data block, and then access the data block .
Step 2 will be repeated for many times. I think it is not efficient.
But step 2 will repeatedly find the block it is visiting to already be in memory, so it is efficient.
Maybe the database system is clever enough to accumulate data access for same physical page, and reduce the times of physical page acess ?
There is a bitmap scan which does that, but such a scan can't be used to fulfill a sort, because it doesn't return the rows in index order. What reduces the cost here is the various levels of caching implemented by the file system, the memory system, and the CPU. PG uses "effective_cache_size" to try to account for these effects, although I admit I don't quite understand what exactly it is doing in this case. I thought that setting effective_cache_size to absurdly low values would make the index scan cost estimate go up a lot, but it only made it go up a little.Cheers,Jeff
Hi Jeff
I will try to learn about effective_cache_size .
Jian gao
2012/11/9 Jeff Janes <jeff.janes@xxxxxxxxx>