Re: Query with high planning time at version 11.1 compared versions 10.5 and 11.0

Amit Langote <Langote_Amit_f8@xxxxxxxxxxxxx> · Thu, 6 Dec 2018 11:14:06 +0900

Hi,

On 2018/12/05 6:55, Alvaro Herrera wrote:
> On 2018-Dec-04, Alvaro Herrera wrote:
> 
>> CREATE TABLE precio(fecha timestamp, pluid int, loccd int, plusalesprice int) PARTITION BY RANGE (fecha); 
>> SELECT format('CREATE TABLE public.precio_%s PARTITION OF public.precio (PRIMARY KEY (fecha, pluid, loccd) ) FOR VALUES FROM (''%s'')TO(''%s'')', i, a, b) FROM (SELECT '1990-01-01'::timestam p+(i||'days')::interval a, '1990-01-02'::timestamp+(i||'days')::interval b, i FROM generate_series(1,999) i)x \gexec
> 
> Actually, the primary keys are not needed; it's just as slow without
> them.

I ran the original unmodified query at [1] (the one that produces an empty
plan due to all children being pruned) against the server built with
patches I posted on the "speeding up planning with partitions" [2] thread
and it finished in a jiffy.

explain SELECT l_variacao.fecha, l_variacao.loccd , l_variacao.pant ,
l_variacao.patual , max_variacao.var_max FROM (SELECT p.fecha, p.loccd,
p.plusalesprice patual, da.plusalesprice pant, abs(p.plusalesprice -
da.plusalesprice) as var from precio p, (SELECT p.fecha, p.plusalesprice,
p.loccd from precio p WHERE p.fecha between '2017-03-01' and '2017-03-02'
and p.pluid = 2) da WHERE p.fecha between '2017-03-01' and '2017-03-02'
and p.pluid = 2 and p.loccd = da.loccd and p.fecha = da.fecha) l_variacao,
(SELECT max(abs(p.plusalesprice - da.plusalesprice)) as var_max from
precio p, (SELECT p.fecha, p.plusalesprice, p.loccd from precio p WHERE
p.fecha between '2017-03-01' and '2017-03-02' and p.pluid = 2) da WHERE
p.fecha between '2017-03-01' and '2017-03-02' and p.pluid = 2 and p.loccd
= da.loccd and p.fecha = da.fecha) max_variacao WHERE max_variacao.var_max
= l_variacao.var;
QUERY PLAN
───────────────────────────────────────────
 Result  (cost=0.00..0.00 rows=0 width=24)
   One-Time Filter: false
(2 rows)

Time: 50.792 ms

That's because one of the things changed by one of the patches is that
child EC members are added only for the non-dummy children.  In this case,
since all the children are pruned, there should be zero child EC members,
which is what would happen in PG 10 too.  The partitionwise join related
changes in PG 11 moved the add_child_rel_equivalences call in
set_append_rel_size such that child EC members would be added even before
checking if the child rel is dummy, but for a reason named in the comment
above the call:

   ... Even if this child is
 * deemed dummy, it may fall on nullable side in a child-join, which
 * in turn may participate in a MergeAppend, where we will need the
 * EquivalenceClass data structures.

However, I think we can skip adding the dummy child EC members here  and
instead make it a responsibility of partitionwise join code in joinrels.c
to add the needed EC members.  Attached a patch to show what I mean, which
passes the tests and gives this planning time:

                            QUERY PLAN
───────────────────────────────────────────────────────────────────
 Result  (cost=0.00..0.00 rows=0 width=24) (actual rows=0 loops=1)
   One-Time Filter: false
 Planning Time: 512.788 ms
 Execution Time: 0.162 ms

which is not as low as with the patches at [2] for obvious reasons, but as
low as we can hope to get with PG 11.  Sadly, planning time is less with
PG 10.6:

                            QUERY PLAN
───────────────────────────────────────────────────────────────────
 Result  (cost=0.00..0.00 rows=0 width=24) (actual rows=0 loops=1)
   One-Time Filter: false
 Planning time: 254.533 ms
 Execution time: 0.080 ms
(4 rows)

But I haven't looked closely at what else in PG 11 makes the planning time
twice that of 10.

> I noticed another interesting thing, which is that if I modify the query
> to actually reference some partition that I do have (as opposed to the
> above, which just takes 30s to prune everything) the plan is mighty
> curious ... if only because in one of the Append nodes, partitions have
> not been pruned as they should.
>
> So, at least two bugs here,
> 1. the equivalence-class related slowness,
> 2. the lack of pruning

I haven't reproduced 2 yet.  Can you share the modified query?

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/20181128004402.GC30707%40telsasoft.com
From 67d051526138b00b7429996655a403535ea6d75c Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@xxxxxxxxx>
Date: Thu, 6 Dec 2018 10:38:25 +0900
Subject: [PATCH] Add child EC members for only the non-dummy children

In set_append_rel_size, child EC members are added before checking
if the child relation itself is dummy.  That's very inefficient if
there are lots of children as the EC lists grow huge.  Such EC
members would remain unused unless the child participates in
partitionwise join, so add them only if the child "actually"
participates in partitionwise join.
---
 src/backend/optimizer/path/allpaths.c | 27 ++++++++++++---------------
 src/backend/optimizer/path/joinrels.c | 19 +++++++++++++++++++
 2 files changed, 31 insertions(+), 15 deletions(-)

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 738bb30848..2b5fef2233 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -1034,21 +1034,6 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 								   1, &appinfo);
 
 		/*
-		 * We have to make child entries in the EquivalenceClass data
-		 * structures as well.  This is needed either if the parent
-		 * participates in some eclass joins (because we will want to consider
-		 * inner-indexscan joins on the individual children) or if the parent
-		 * has useful pathkeys (because we should try to build MergeAppend
-		 * paths that produce those sort orderings). Even if this child is
-		 * deemed dummy, it may fall on nullable side in a child-join, which
-		 * in turn may participate in a MergeAppend, where we will need the
-		 * EquivalenceClass data structures.
-		 */
-		if (rel->has_eclass_joins || has_useful_pathkeys(root, rel))
-			add_child_rel_equivalences(root, appinfo, rel, childrel);
-		childrel->has_eclass_joins = rel->has_eclass_joins;
-
-		/*
 		 * We have to copy the parent's quals to the child, with appropriate
 		 * substitution of variables.  However, only the baserestrictinfo
 		 * quals are needed before we can check for constraint exclusion; so
@@ -1186,6 +1171,18 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 			continue;
 		}
 
+		/*
+		 * We have to make child entries in the EquivalenceClass data
+		 * structures as well.  This is needed either if the parent
+		 * participates in some eclass joins (because we will want to consider
+		 * inner-indexscan joins on the individual children) or if the parent
+		 * has useful pathkeys (because we should try to build MergeAppend
+		 * paths that produce those sort orderings).
+		 */
+		if (rel->has_eclass_joins || has_useful_pathkeys(root, rel))
+			add_child_rel_equivalences(root, appinfo, rel, childrel);
+		childrel->has_eclass_joins = rel->has_eclass_joins;
+
 		/* CE failed, so finish copying/modifying join quals. */
 		childrel->joininfo = (List *)
 			adjust_appendrel_attrs(root,
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index d3d21fed5d..631468491c 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -1376,6 +1376,25 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 		AppendRelInfo **appinfos;
 		int			nappinfos;
 
+		/*
+		 * set_append_rel_size() doesn't add EC members for dummy child
+		 * relations.  Do that now, because downstream code relies on child
+		 * EC members having been added.
+		 */
+		if (IS_SIMPLE_REL(child_rel1) && IS_DUMMY_REL(child_rel1))
+		{
+			AppendRelInfo *appinfo = root->append_rel_array[child_rel1->relid];
+
+			add_child_rel_equivalences(root, appinfo, rel1, child_rel1);
+		}
+
+		if (IS_SIMPLE_REL(child_rel2) && IS_DUMMY_REL(child_rel2))
+		{
+			AppendRelInfo *appinfo = root->append_rel_array[child_rel2->relid];
+
+			add_child_rel_equivalences(root, appinfo, rel2, child_rel2);
+		}
+
 		/* We should never try to join two overlapping sets of rels. */
 		Assert(!bms_overlap(child_rel1->relids, child_rel2->relids));
 		child_joinrelids = bms_union(child_rel1->relids, child_rel2->relids);
-- 
2.11.0