Hi all,

It's been a while since I actually got to use PG for anything serious, but we're finally doing some experimentation at work now to see whether it is suitable for our data warehouse. So far it has been doing well, but there is a particular type of query, one I expect we will use frequently, for which the planner keeps choosing a sequential scan - and I can't fathom why.

This is on:

The query in question is:

  select "VBAK_MANDT", max("VBAK_VBELN")
    from staging.etl00001_vbak
   group by "VBAK_MANDT";

This is the header table for another detail table, and in this case we're already seeing a seqscan. The thing is, there are 15M rows in the table (disk usage is 15GB), while the PK is on ("VBAK_MANDT", "VBAK_VBELN") with very few distinct values for "VBAK_MANDT" (in fact, we only have 1 at the moment!).

Explain analyze says the following about this query:

warehouse=# explain (analyze, buffers) select "VBAK_MANDT", max("VBAK_VBELN") from staging.etl00001_vbak group by "VBAK_MANDT";
                                                               QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=1990054.08..1990054.09 rows=1 width=36) (actual time=38723.602..38723.602 rows=1 loops=1)
   Group Key: "VBAK_MANDT"
   Buffers: shared hit=367490 read=1409344
   ->  Seq Scan on etl00001_vbak  (cost=0.00..1918980.72 rows=14214672 width=15) (actual time=8.886..31317.283 rows=14214672 loops=1)
         Buffers: shared hit=367490 read=1409344
 Planning time: 0.126 ms
 Execution time: 38723.674 ms
(7 rows)

As you can see, a sequential scan. The statistics seem quite correct, too: the estimated row count matches the actual one. If I change the query to select a single value of "VBAK_MANDT", we get:

warehouse=# explain (analyze, buffers) select max("VBAK_VBELN") from staging.etl00001_vbak where "VBAK_MANDT" = '300';
                                                                                   QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=1.37..1.38 rows=1 width=32) (actual time=14.911..14.911 rows=1 loops=1)
   Buffers: shared hit=2 read=3
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.56..1.37 rows=1 width=11) (actual time=14.907..14.908 rows=1 loops=1)
           Buffers: shared hit=2 read=3
           ->  Index Only Scan Backward using etl00001_vbak_pkey on etl00001_vbak  (cost=0.56..11498362.31 rows=14214672 width=11) (actual time=14.906..14.906 rows=1 loops=1)
                 Index Cond: (("VBAK_MANDT" = '300'::bpchar) AND ("VBAK_VBELN" IS NOT NULL))
                 Heap Fetches: 1
                 Buffers: shared hit=2 read=3
 Planning time: 0.248 ms
 Execution time: 14.945 ms
(11 rows)

That is much more in line with my expectations. Oddly enough, adding "VBAK_MANDT" and the group by back into that last query makes the result a seqscan again.

For "VBAK_MANDT" we see these statistics:

  Null fraction:           0
  Average width:           4
  Distinct values:         1
  Most common values:      {300}
  Most common frequencies: {1}
  Histogram bounds:
  Correlation:             1

The table definition is:

              Table "staging.etl00001_vbak"
       Column        |     Type      | Modifiers
---------------------+---------------+-----------
 VBAK_MANDT          | character(3)  | not null
 VBAK_VBELN          | character(10) | not null
 VBAK_ERDAT          | date          | not null
 VBAK_ERZET          | character(6)  | not null
 VBAK_ERNAM          | character(12) | not null
 VBAK_ANGDT          | date          | not null
 VBAK_BNDDT          | date          | not null
 VBAK_AUDAT          | date          | not null
...
 VBAK_MULTI          | character(1)  | not null
 VBAK_SPPAYM         | character(2)  | not null
Indexes:
    "etl00001_vbak_pkey" PRIMARY KEY, btree ("VBAK_MANDT", "VBAK_VBELN")
    "idx_etl00001_vbak_erdat" btree ("VBAK_ERDAT")

A final remark: the table definition was generated by our reporting/ETL software, hence the case-sensitive column names and the use of the character type instead of varchar (or text).

What can be done to help the planner choose a smarter plan?

-- 
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.
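P.S. One workaround that comes to mind is to emulate a "loose index scan" by hand: walk the distinct clients with a recursive CTE and fetch the max per client through the primary key, so each group becomes a few index probes instead of a 15M-row scan. A rough, untested sketch of what I mean (the max_vbeln alias is just for illustration):

  with recursive clients as (
      -- smallest client, via the ("VBAK_MANDT", "VBAK_VBELN") PK index
      select min("VBAK_MANDT") as "VBAK_MANDT"
        from staging.etl00001_vbak
      union all
      -- next client after the previous one, again via the PK index
      select (select min(v."VBAK_MANDT")
                from staging.etl00001_vbak v
               where v."VBAK_MANDT" > c."VBAK_MANDT")
        from clients c
       where c."VBAK_MANDT" is not null
  )
  select c."VBAK_MANDT",
         (select max(v."VBAK_VBELN")
            from staging.etl00001_vbak v
           where v."VBAK_MANDT" = c."VBAK_MANDT") as max_vbeln
    from clients c
   where c."VBAK_MANDT" is not null;

I'd much rather understand why the grouped aggregate itself can't use the index, though, than carry a rewrite like this through our ETL tooling.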