Thanks Ron. This is what we are doing right now. we have job to analyze tables every 2 hours. However, the update statistics jobs cause the execution plan change, and query runs slowly sometimes.
I think the problem is, the “good” and “bad” query execution plan cost are close, when data changes, analyze table will update the statistics, and make the plan cost change, due to the cost model parameters is not fully optimized, it picks
the “bad” execution plan.
Thanks,
James
Sent from
Mail for Windows
Manually (aka through a cron script) ANALYZE those seven tables on a regular basis. During the business day, I ANALYZE some tables
every two hours.
On 5/31/22 21:20, jian xu wrote:
Thanks everyone. The reason I asked this question is, my system has a complex query, which uses CTE, then join 7 other tables. It also has where clause using json data filter. Sometimes it runs slowly(5 minutes), but sometimes it runs fast(2
seconds), (query text is same , no parameter). I know it is because data distribution change. The table is updated frequently.
I compared the execution plan, the fast query joins other tables first, then join CTE, however, the slow question joins CTE first, then join other tables. I analyzed the tables, it didn’t work, I increased the column sampling from 100 to
10000 for all the columns used in where clause, didn’t work. I set the random io cost to 1, it didn’t work.
I am not able to modify the query. The only solution I find is
- use
join_collapse_limit
to force the join order, however, it has limitation, I need to modify the code to set it before
submitting the query, and set it back to original value after query run, in case it didn’t change back, other query will affect.
- Set both sequence and random io cost to 0.1, it can always generate correct plan.
So I suspect the cost model parameters are not fully optimized on my system, the fast and slow execution plan cost might be very close, so a little bit data change will cause the plan change, and generated a wrong plan.
Thanks
James
Sent from
Mail for Windows
Laurenz Albe <laurenz.albe@xxxxxxxxxxx> writes:
> I think that the values of those parameters are more set by experience and tradition than
> by measurable physical evidence. In a way, it just happens to work (mostly).
> For example, who says that on a spinning disk, random I/O is four times as slow as
> sequential I/O? Very likely, you have to factor in that part of the I/O requests are
> satisfied from the kernel page cache.
The 4.0 value *is* based on some real evidence --- many years ago, I put
in a lot of machine time to get an estimate of that for non-cached fetches
on the hardware I was using at the time. If somebody did that over again
today, they'd likely get a somewhat different value, even assuming they
were still using spinning-rust storage. But it doesn't matter too much.
A value around that is probably fine for rotating storage, while if you
are using SSD it's likely better to use something around 1.0, and beyond
that any inaccuracies in the cost constants are usually swamped by
imperfections of the cost models and other data. For instance, it's not
that useful to sweat about what the right value is for cpu_operator_cost
when we've never spent any serious effort on assigning plausible procost
factors to different functions. (Worse, the whole notion that a given
function has a fixed cost independent of its input data is sadly
inadequate.)
regards, tom lane
--
Angular momentum makes the world go 'round.