I too am a bit perplexed by why runtime partition pruning does not seem
to work with this example. Anybody got any ideas about this?
Regards,
Michael Vitale
Thomas Kellerer wrote on 8/2/2019 9:58 AM:
I stumbled across this question on SO: https://stackoverflow.com/questions/56517852
Disregarding the part about Postgres 9.3, the example for Postgres 11 looks a bit confusing.
There is a script to setup test data in that question:
==== start of script ====
create table foo (
  foo_id   integer not null,
  foo_name varchar(10),
  constraint foo_pkey primary key (foo_id)
);

insert into foo (foo_id, foo_name)
values
  (1, 'eeny'),
  (2, 'meeny'),
  (3, 'miny'),
  (4, 'moe'),
  (5, 'tiger'),
  (6, 'toe');

create table foo_bar_baz (
  foo_id integer not null,
  bar_id integer not null,
  baz    integer not null,
  constraint foo_bar_baz_pkey primary key (foo_id, bar_id, baz),
  constraint foo_bar_baz_fkey1 foreign key (foo_id)
    references foo (foo_id)
) partition by range (foo_id);
create table if not exists foo_bar_baz_0 partition of foo_bar_baz for values from (0) to (1);
create table if not exists foo_bar_baz_1 partition of foo_bar_baz for values from (1) to (2);
create table if not exists foo_bar_baz_2 partition of foo_bar_baz for values from (2) to (3);
create table if not exists foo_bar_baz_3 partition of foo_bar_baz for values from (3) to (4);
create table if not exists foo_bar_baz_4 partition of foo_bar_baz for values from (4) to (5);
create table if not exists foo_bar_baz_5 partition of foo_bar_baz for values from (5) to (6);
with foos_and_bars as (
  select ((random() * 4) + 1)::int as foo_id, bar_id::int
  from generate_series(0, 1499) as t(bar_id)
), bazzes as (
  select baz::int
  from generate_series(1, 1500) as t(baz)
)
insert into foo_bar_baz (foo_id, bar_id, baz)
select foo_id, bar_id, baz
from bazzes as bz
join foos_and_bars as fab on mod(bz.baz, fab.foo_id) = 0;
==== end of script ====
I see some strange behaviour, similar to what is reported in the comments to that question:
When I run the test query immediately after populating the tables with the sample data:
explain analyze
select count(*)
from foo_bar_baz as fbb
join foo on fbb.foo_id = foo.foo_id
where foo.foo_name = 'eeny'
I do see an "Index Only Scan .... (never executed)" in the plan for the irrelevant partitions:
https://explain.depesz.com/s/AqlE
However once I run "analyze foo_bar_baz" (or "vacuum analyze"), Postgres chooses to do a "Parallel Seq Scan" for each partition:
https://explain.depesz.com/s/WwxE
Why does updating the statistics mess up (runtime) partition pruning?
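(A diagnostic worth trying when reproducing this: in Postgres 11, runtime pruning only happens under an Append node whose scans are parameterized, e.g. by the outer side of a nested loop. The hash join with Parallel Seq Scans in the second plan has no such parameter, so there is nothing to prune at runtime. The settings below are real GUCs; whether forcing a nested loop actually brings back the "(never executed)" partitions for this data set is my speculation, not something stated in the original post.)

```sql
-- sanity check: plan-time and run-time pruning are both governed by this GUC
show enable_partition_pruning;  -- should be "on", the default in 11

-- speculative experiment: disable the other join methods so the planner
-- must pick a nested loop, under which runtime pruning can kick in
set enable_hashjoin = off;
set enable_mergejoin = off;

explain analyze
select count(*)
from foo_bar_baz as fbb
join foo on fbb.foo_id = foo.foo_id
where foo.foo_name = 'eeny';
```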
I played around with random_page_cost and that didn't change anything.
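(For reference, that experiment would look like the following; the post does not say which values were tried, so 1.1 is only an illustrative low setting:)

```sql
-- illustrative value only: a low random_page_cost makes index access cheaper
set random_page_cost = 1.1;
-- then re-run the explain analyze above
```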
I tried to create extended statistics on "foo (foo_id, foo_name)" so that the planner would know that there is only one name per id. No change.
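(A sketch of what that attempt would look like; the statistics object name is mine. Note that in 11 only the "dependencies" and "ndistinct" kinds exist, and neither feeds join selectivity across tables, which may be why it changed nothing:)

```sql
-- functional-dependency statistics on foo: tells the planner that
-- foo_id determines foo_name (one name per id)
create statistics foo_name_dep (dependencies) on foo_id, foo_name from foo;
analyze foo;
```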
I saw the above behaviour both on Windows 10 (my laptop) and on CentOS 7 (a test environment in a VM).
On the CentOS server default_statistics_target is set to 100; on my laptop it is set to 1000.
In both cases the Postgres version was 11.4.
Any ideas?
Thomas