Fwd: Estimating the execution cost of a query in a partitioned schema: Weird execution plans and totally wrong execution costs (pg_class and pg_statistic)

Nino Arsov <nino.arsov@xxxxxxxxx> · Sun, 8 Feb 2015 16:20:59 +0100

---------- Forwarded message ----------
From: Nino Arsov <nino.arsov@xxxxxxxxx>
Date: Wed, Feb 4, 2015 at 12:24 PM
Subject: Re: [ADMIN] Estimating the execution cost of a query in a partitioned schema: Weird execution plans and totally wrong execution costs (pg_class and pg_statistic)
To: Tom Lane <tgl@xxxxxxxxxxxxx>

Hi Mr. Lane,
I guess I have misinterpreted the "admin" part of the title. Is there a way for this thread to be moved to pgsql-hackers (which I think is more appropriate).

On the pg_statistic data insertion: We did have a little struggle with that, but we do that as follows;
To insert ["AB", "BC", "CD"] to stavaluesN, we use 
array_in(anyarray_out('{"AB", "BC", "CD"}'::character[]), 'character'::regtype, 1). For numeric data types, only the corresponding type is changed in the argument list.

No modifications have been done to the source code of the planner nor the parameters that can be set using SQL. I'm using a clean, "out-of-the-box" 9.3 installation under Linux.

I have done some analysis last night and it's a bit hard to draw a conclusion for the problem. I guess the large cost offset when a multi-way join is present in a query is a result of an additive error. 
I've managed to get exactly the same execution plan from the planner without having any modifications done to the planner. I've tracked the problem in the row estimation phase of each plan node. 
Namely, "Plan Rows" has a value of 1 in many nodes of the plan, like Hash Join, Seq Scan, etc. I've checked pg_statistics and pg_class and everything looks fine (as far as I can see).
I'll post an example query from Star Schema Benchmark and the associated execution plan. I've also noticed that "Plan Rows" gets to be 1 in some two-way joins besides this four-way join.

select c_nation, s_nation, d_year, sum(lo_revenue) as revenue
from customer, lineorder, supplier, dates
where lo_custkey=c_custkey 
and lo_suppkey = s_suppkey
and lo_orderdate = d_datekey
and c_region = 'ASIA' and s_region = 'ASIA'
and d_year >= 1992 and d_year <= 1997
group by c_nation, s_nation, d_year
order by d_year asc, revenue desc;

This gets executed against all valid partitions of the four tables that contain the data subjected to the constraints in the WHERE clause. You can examine the plans below:
Estimated execution plan: http://codebeautify.org/jsonviewer/fa36b7
Actual execution plan: http://codebeautify.org/jsonviewer/a370ce
Another point to take into account is the assumption of uniformity of data. I've checked the histograms of the non-partitioned tables against their partitioned counterparts and the shape of the histogram is basically the same with a small deviation of the bar height due to the range partitioning, which is inevitable.

The changes to the system catalogs remain the same as before. The only difference is the "Plan Rows" = 1 and it seems it's hard to track the problem. In the meantime, if I manage to resolve this issue, I'll post the solution here, as others might also find it useful.

Best regards,
Nino Arsov

On Tue, Feb 3, 2015 at 4:35 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Nino Arsov <nino.arsov@xxxxxxxxx> writes:

> Using EXPLAIN we have successfully managed to get very similar execution

> plans and costs by the planner as if there were data in the newly created

> partitions. This method works just fine as long as the queries to be

> estimated consist of a join of no more than 2 tables.

> When we try to estimate a query's cost that contains a join between 3 or

> more tables we get huge costs and wrong plans (in the rank of millions of

> cost units). When data is actually loaded into all of the partitions, the

> cost does not exceed a few thousand cost units.

Have you modified the planner at all?  Because it's a bit hard to see how

inserting faked-up statistics as described would work properly in a 2-way

join but not in a 3-way join.  A bug in some hack or other would be a

much more plausible explanation.

I'm also curious about exactly how you're inserting new data into

pg_statistic --- the "anyarray" columns that are used there are not

readily modifiable from SQL.

(This is pretty off-topic for pgsql-admin, btw.)

                        regards, tom lane