On Tue, Aug 13, 2019 at 8:46 AM Thomas Kellerer <spam_eater@xxxxxxx> wrote:
I stumbled across this question on SO: https://stackoverflow.com/questions/56517852
Disregarding the part about Postgres 9.3, the example for Postgres 11 looks a bit confusing.
There is a script to setup test data in that question:
==== start of script ====
create table foo (
  foo_id   integer not null,
  foo_name varchar(10),
  constraint foo_pkey primary key (foo_id)
);

insert into foo (foo_id, foo_name)
values
  (1, 'eeny'),
  (2, 'meeny'),
  (3, 'miny'),
  (4, 'moe'),
  (5, 'tiger'),
  (6, 'toe');

create table foo_bar_baz (
  foo_id integer not null,
  bar_id integer not null,
  baz    integer not null,
  constraint foo_bar_baz_pkey primary key (foo_id, bar_id, baz),
  constraint foo_bar_baz_fkey1 foreign key (foo_id) references foo (foo_id)
) partition by range (foo_id);

create table if not exists foo_bar_baz_0 partition of foo_bar_baz for values from (0) to (1);
create table if not exists foo_bar_baz_1 partition of foo_bar_baz for values from (1) to (2);
create table if not exists foo_bar_baz_2 partition of foo_bar_baz for values from (2) to (3);
create table if not exists foo_bar_baz_3 partition of foo_bar_baz for values from (3) to (4);
create table if not exists foo_bar_baz_4 partition of foo_bar_baz for values from (4) to (5);
create table if not exists foo_bar_baz_5 partition of foo_bar_baz for values from (5) to (6);

-- populate the partitions: foo_id is a random value between 1 and 5
with foos_and_bars as (
  select ((random() * 4) + 1)::int as foo_id, bar_id::int
  from generate_series(0, 1499) as t(bar_id)
), bazzes as (
  select baz::int
  from generate_series(1, 1500) as t(baz)
)
insert into foo_bar_baz (foo_id, bar_id, baz)
select fab.foo_id, fab.bar_id, bz.baz
from bazzes as bz
join foos_and_bars as fab on mod(bz.baz, fab.foo_id) = 0;
==== end of script ====
I see some strange behaviour, similar to what is reported in the comments to that question:
When I run the test query immediately after populating the tables with the sample data:
explain analyze
select count(*)
from foo_bar_baz as fbb
join foo on fbb.foo_id = foo.foo_id
where foo.foo_name = 'eeny';
I do see an "Index Only Scan .... (never executed)" in the plan for the irrelevant partitions:
https://explain.depesz.com/s/AqlE
However once I run "analyze foo_bar_baz" (or "vacuum analyze"), Postgres chooses to do a "Parallel Seq Scan" for each partition:
https://explain.depesz.com/s/WwxE
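To spell out the reproduction (the query is the same one as above):

analyze foo_bar_baz;   -- or: vacuum analyze;

-- re-running the same query: now every partition gets a parallel seq scan
explain analyze
select count(*)
from foo_bar_baz as fbb
join foo on fbb.foo_id = foo.foo_id
where foo.foo_name = 'eeny';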
Why does updating the statistics mess up (runtime) partition pruning?
I played around with random_page_cost and that didn't change anything.
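That was just session-level experimenting along these lines (the exact value is only an example):

-- example value only; lowering it did not change the chosen plan
set random_page_cost = 1.1;
-- (then re-ran the explain analyze from above)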
I tried to create extended statistics on foo(foo_id, foo_name) so that the planner would know that there is only one name per id. No change.
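The extended statistics attempt looked roughly like this (the statistics name is made up for the example):

-- ndistinct/dependencies statistics on the foo columns; no effect on the plan
create statistics foo_id_name_stats (ndistinct, dependencies)
  on foo_id, foo_name from foo;
analyze foo;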
I saw the above behaviour when running this on Windows 10 (my laptop) and on CentOS 7 (a test environment in a VM).
On the CentOS server default_statistics_target is set to 100; on my laptop it is set to 1000.
In both cases the Postgres version was 11.4.
Any ideas?
Thomas
I ran into the same behaviour of the planner. The number of rows in the partitions influences the statistics being generated, and the statistics in turn influence the plan chosen.
I managed to force the "correct" plan by manually setting the n_distinct statistics for the partitioned table.
E.g.: alter table foo_bar_baz alter column foo_id set (n_distinct = -1, n_distinct_inherited = -1);
With a certain number of rows in the partitions, the analyser sets the n_distinct value for the partitioned table to the number of unique partition keys, and the n_distinct value
for the individual partitions to the number of unique partition keys in that partition. Unfortunately this causes the planner to pick a plan that doesn't allow for execution-time pruning,
resulting in very slow execution times.
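Note that the override only takes effect once the table is analysed again; to apply and verify it I did roughly:

-- the override is picked up at the next analyze of the partitioned table
analyze foo_bar_baz;

-- one way to check that the option was stored:
select attoptions
from pg_attribute
where attrelid = 'foo_bar_baz'::regclass
and attname = 'foo_id';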
Regards,
Sverre