RE: how to calibrate the cost model parameters

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thanks Ron. This is what we are doing right now. we have job to analyze tables every 2 hours. However, the update statistics jobs cause the execution plan change, and query runs slowly sometimes.

I think the problem is, the “good” and “bad” query execution plan cost are close, when data changes, analyze table will update the statistics, and make the plan cost change, due to the cost model parameters is not fully optimized, it picks the “bad” execution plan.

Thanks,

James

 

Sent from Mail for Windows

 

From: Ron
Sent: Thursday, June 2, 2022 2:17 AM
To: pgsql-admin@xxxxxxxxxxxxxxxxxxxx
Subject: Re: how to calibrate the cost model parameters

 

Manually (aka through a cron script) ANALYZE those seven tables on a regular basis.  During the business day, I ANALYZE some tables every two hours.

On 5/31/22 21:20, jian xu wrote:

Thanks everyone. The reason I asked this question is, my system has a complex query, which uses CTE, then join 7 other tables. It also has where clause using json data filter. Sometimes it runs slowly(5 minutes), but sometimes it runs fast(2 seconds), (query text is same , no parameter). I know it is because data distribution change. The table is  updated frequently.

I compared the execution plan, the fast query joins other tables first, then join CTE, however, the slow question joins CTE first, then join other tables. I analyzed the tables, it didn’t work, I increased the column sampling from 100 to 10000 for all the columns used in where clause, didn’t work. I set the random io cost to 1, it didn’t work.

I am not able to modify the query. The only solution I find is

  1. use join_collapse_limit to force the join order, however, it has limitation, I need to modify the code to set it before submitting the query, and set it back to original value after query run, in case it didnt change back, other query will affect.
  2. Set both sequence and random io cost to 0.1, it can always generate correct plan.

So I suspect the cost model parameters are not fully optimized on my system, the fast and slow execution plan cost might be very close, so a little bit data change will cause the plan change, and generated a wrong plan.

Thanks

James

 

Sent from Mail for Windows

 

From: Tom Lane
Sent: Tuesday, May 31, 2022 10:44 AM
To: Laurenz Albe
Cc: Jeff Janes; jian xu; pgsql-admin@xxxxxxxxxxxxxx
Subject: Re: how to calibrate the cost model parameters

 

Laurenz Albe <laurenz.albe@xxxxxxxxxxx> writes:
> I think that the values of those parameters are more set by experience and tradition than
> by measurable physical evidence.  In a way, it just happens to work (mostly).
> For example, who says that on a spinning disk, random I/O is four times as slow as
> sequential I/O?  Very likely, you have to factor in that part of the I/O requests are
> satisfied from the kernel page cache.

The 4.0 value *is* based on some real evidence --- many years ago, I put
in a lot of machine time to get an estimate of that for non-cached fetches
on the hardware I was using at the time.  If somebody did that over again
today, they'd likely get a somewhat different value, even assuming they
were still using spinning-rust storage.  But it doesn't matter too much.
A value around that is probably fine for rotating storage, while if you
are using SSD it's likely better to use something around 1.0, and beyond
that any inaccuracies in the cost constants are usually swamped by
imperfections of the cost models and other data.  For instance, it's not
that useful to sweat about what the right value is for cpu_operator_cost
when we've never spent any serious effort on assigning plausible procost
factors to different functions.  (Worse, the whole notion that a given
function has a fixed cost independent of its input data is sadly
inadequate.)

                        regards, tom lane

 

 

--
Angular momentum makes the world go 'round.

 


[Index of Archives]     [Postgresql Home]     [Postgresql General]     [Postgresql Performance]     [Postgresql PHP]     [Postgresql Jobs]     [PHP Users]     [PHP Databases]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Yosemite Forum]

  Powered by Linux