Re: [EXT] Re: Improve "select count(*)" query - takes more than 30 mins for some large tables

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 7/12/22 14:25, MichaelDBA Vitale wrote:
That is not true: doing the select on the primary key will still result in a table scan, not an index scan.  The heap always gets accessed for select counts.

Regards,
Michael Vitale

Well, not always:

mgogala=# explain select count(*) from emp;
                        QUERY PLAN                        
----------------------------------------------------------
 Aggregate  (cost=1.18..1.19 rows=1 width=8)
   ->  Seq Scan on emp  (cost=0.00..1.14 rows=14 width=0)
(2 rows)

That is the classic "select count(*)".  Now, let's see PK:

mgogala=# \d emp
                           Table "mgogala.emp"
  Column  |            Type             | Collation | Nullable | Default
----------+-----------------------------+-----------+----------+---------
 empno    | smallint                    |           | not null |
 ename    | character varying(10)       |           |          |
 job      | character varying(9)        |           |          |
 mgr      | smallint                    |           |          |
 hiredate | timestamp without time zone |           |          |
 sal      | double precision            |           |          |
 comm     | double precision            |           |          |
 deptno   | smallint                    |           |          |
Indexes:
    "emp_pkey" PRIMARY KEY, btree (empno)
Foreign-key constraints:
    "fk_deptno" FOREIGN KEY (deptno) REFERENCES dept(deptno)


Now, let's do scan on "count(empno)":

mgogala=# explain select count(empno) from emp;
                        QUERY PLAN                        
----------------------------------------------------------
 Aggregate  (cost=1.18..1.19 rows=1 width=8)
   ->  Seq Scan on emp  (cost=0.00..1.14 rows=14 width=2)
(2 rows)

So far, so good. Sequential scan, despite the existence of the NOT NULL primary key. Now comes the twist:

mgogala=# set enable_seqscan=off;
SET
mgogala=# explain select count(empno) from emp;
                                   QUERY PLAN                                   
 
--------------------------------------------------------------------------------
-
 Aggregate  (cost=12.38..12.39 rows=1 width=8)
   ->  Index Only Scan using emp_pkey on emp  (cost=0.14..12.35 rows=14 width=2)
(2 rows)

Voila, we've got index only "count" scan. So, in some cases Postgres does do index only scan. What is surprising is that pg_hint_plan doesn't produce the desired results:

mgogala=# explain /*+ IndexScan(emp emp_pkey) */ select count(empno) from emp;
                                  QUERY PLAN                                   
-------------------------------------------------------------------------------
 Aggregate  (cost=10000000001.17..10000000001.18 rows=1 width=8)
   ->  Seq Scan on emp  (cost=10000000000.00..10000000001.14 rows=14 width=2)
 JIT:
   Functions: 3
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(5 rows)

This is strange in more ways than one:

  1. The hint didn't work, Postgres still performed sequential scan
  2. JIT got involved for the reasons that are completely unclear.
  3. The cost is enormous, despite the fact the the "emp" table has only 14 rows and is fully contained in a single block.

However, if we do GUC hint, the result behaves as expected:

mgogala=# explain /*+ set(enable_seqscan off) */ select count(empno) from emp e;
                                    QUERY PLAN                                  
   
--------------------------------------------------------------------------------
---
 Aggregate  (cost=12.38..12.39 rows=1 width=8)
   ->  Index Only Scan using emp_pkey on emp e  (cost=0.14..12.35 rows=14 width=2)
(2 rows)

So, it is possible to force index only scan. However, from purely theoretic point of view, the speed difference is questionable. Indexes are stored differently than tables, they have branch blocks and data bocks with an ample free space in the data blocks to prevent block splits and index level increase, with mandatory re-balancing. Indexes are actually quite large structures, sometimes even larger than the underlying tables. In general, going after the index-only scan doesn't look like a good strategy.

After all this work, I also have a question: does PostgreSQL read index blocks in batches or it does single block scans? Some other databases have 2 events: "db file scattered read" for sequential scan (called "full table scan" in that specific vernacular) and "db file sequential read" for index scans. Those database only do single block reads from indexes while the tables can be read using batches of blocks. Does PostgreSQL do the same thing?

-- 
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com

[Index of Archives]     [Postgresql Home]     [Postgresql General]     [Postgresql Performance]     [Postgresql PHP]     [Postgresql Jobs]     [PHP Users]     [PHP Databases]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Yosemite Forum]

  Powered by Linux