Re: Cost estimate vs. actual - do I care?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



3), 2), 1).

The planner needs the right information to make the right decision. However, the planner rarely has perfect information, so the algorithms need to be able to cope with some amount of imperfection while still generally making the right decision. There are also a limited (relatively) set of possible plans and the plans often enough have different enough characteristics that the truly different plans (i.e. factor of 10 difference in terms of run time) won't be selected by accident, even with fairly bad estimates.

It probably depends mostly on your data set. For many data sets, the estimates might be possible to be off by a factor of 10 and still come up with the same "right" plan. For many other data sets, for example, the difference between an index scan or a sequential scan could result in performance differences by a factor of 10 or more and picking the "right one" is incredibly important.

If you are worried, you can test your theory with "set enable_seqscan=0" and other such things to see what alternative plans might be generated and whether they are indeed better or not. Generally, I only worry about it for queries that I know to be bottlenecks, and many of the times - the time investment I make into trying to prove that a better query is possible ends up only educating me on why I am wrong... :-)

The majority of the time for me, anyways, I don't find that the estimates are that bad or that the planner is wrong. It's usually the typical scenario where somebody added a query but forgot to make sure the query was efficient by ensuring that the indexes properly accelerate their query.

Good luck.

mark


On 01/01/2012 01:59 PM, Jay Levitt wrote:
I gather that a big part of making queries performant is making sure the planner's estimates reflect reality.

Given a random explain analyze line:

Limit (cost=0.02..943.11 rows=50 width=16) (actual time=0.603..79.729 rows=50 loops=1)

which is the truer statement?

1. As long as costs go up with actual time, you're fine.

2. You should try to ensure that costs go up linearly with actual time.

3. You should try to ensure that costs are as close as possible to actual time.

4. The number "4".

Jay Levitt



--
Mark Mielke<mark@xxxxxxxxx>


--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux