Search Postgresql Archives

Re: Performance with high correlation in group by on PK

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Aug 28, 2017 at 5:22 AM, Alban Hertroys <haramrae@xxxxxxxxx> wrote:
Hi all,

It's been a while since I actually got to use PG for anything serious,
but we're finally doing some experimentation @work now to see if it is
suitable for our datawarehouse. So far it's been doing well, but there
is a particular type of query I run into that I expect we will
frequently use and that's choosing a sequential scan - and I can't
fathom why.

This is on:


The query in question is:
select "VBAK_MANDT", max("VBAK_VBELN")
  from staging.etl00001_vbak
 group by "VBAK_MANDT";

This is the header-table for another detail table, and in this case
we're already seeing a seqscan. The thing is, there are 15M rows in
the table (disk usage is 15GB), while the PK is on ("VBAK_MANDT",
"VBAK_VBELN") with very few distinct values for "VBAK_MANDT" (in fact,
we only have 1 at the moment!).

You need an "index skip-scan" or "loose index scan".  PostgreSQL doesn't currently detect and implement them automatically, but you can use a recursive CTE to get it to work.  There are some examples at https://wiki.postgresql.org/wiki/Loose_indexscan

Cheers,

Jeff

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux