Re: Dynamically updating the estimated cost of a transaction

Hasini Gunasinghe wrote:

> I am looking for a way to let the user know the estimated time for the transaction he requested and, while the transaction is in progress, how much time has elapsed as a fraction of the total estimated time at any given instant, re-estimating that total dynamically as the transaction runs.
I think this one needs to get added to the FAQ. To re-use from when I answered this last month: this just isn't exposed in PostgreSQL yet. Clients ask for queries to be run, and eventually they get rows of results back, but there's no notion of how many rows they're going to get in advance or how far along the executor is in the query plan. There are a couple of academic projects that have started exposing more of the query internals, but I'm not aware of anyone who's even started moving in the direction of what you'd need to produce a progress bar or estimate a total run-time. It's a hard problem--you could easily spend several years of your life on this alone and still not have even a mediocre way to predict how much time is left to execute a generic query.

In practice, people tend to save query log files showing historical information about how long queries took to run, and then use that to predict future response times. That's a much easier way to get something useful for a lot of applications than expecting you can ever estimate just based on an EXPLAIN plan.
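As a minimal sketch of that history-based approach: suppose you've already extracted past run times for a given query class from the server log (for example, lines produced with log_min_duration_statement enabled). The function below, a hypothetical helper rather than anything PostgreSQL provides, takes those historical durations plus the elapsed time of the current run and returns an estimated progress fraction.

```python
import statistics

def estimate_progress(history_secs, elapsed_secs):
    """Estimate progress of a running query from past run times.

    Uses the median of historical durations as the predicted total,
    since the median is robust to the occasional outlier run.
    """
    predicted_total = statistics.median(history_secs)
    # Cap at 0.99: the current run may simply take longer than history suggests
    return min(elapsed_secs / predicted_total, 0.99)

# Durations (seconds) scraped from past query log entries for this query
history = [12.0, 14.5, 11.8, 13.2]
print(round(estimate_progress(history, 6.5), 2))
```

This only works when the same query runs repeatedly with similar data volumes; a query whose run time varies wildly with its parameters needs history bucketed by parameter values.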

> I got to know how PostgreSQL estimates the cost for a particular operation. Does estimated cost mean the estimated time to evaluate an operation in the context of PostgreSQL? And may I know if there is any way to achieve the requirement I mentioned above with PostgreSQL?
Estimated costs are never used to predict an estimated time. An interesting research project would be trying to tie the two together more tightly, by collecting a bunch of data measuring real EXPLAIN ANALYZE execution times alongside their respective cost estimates.
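The first step of such a project might look like the sketch below: given hypothetical pairs of planner cost estimates and measured EXPLAIN ANALYZE times for the same queries, fit an ordinary least-squares line to see how many milliseconds one planner cost unit corresponds to on a given system. The data values here are made up for illustration.

```python
# Hypothetical measurements: planner cost estimates paired with the
# actual run times (ms) reported by EXPLAIN ANALYZE for the same queries.
costs = [100.0, 2500.0, 18000.0, 42000.0]
times_ms = [1.2, 23.0, 150.0, 390.0]

n = len(costs)
mean_c = sum(costs) / n
mean_t = sum(times_ms) / n

# Ordinary least-squares slope: milliseconds of run time per cost unit
slope = sum((c - mean_c) * (t - mean_t) for c, t in zip(costs, times_ms)) \
        / sum((c - mean_c) ** 2 for c in costs)
intercept = mean_t - slope * mean_c

print(f"{slope:.5f} ms per cost unit")
```

Even a good fit here only holds for one machine, one cache state, and one mix of plan node types; the same cost total can hide very different I/O patterns, which is part of why this mapping is hard to do in general.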

Even after you collected it, actually using the data from such research is quite tricky. For example, people have tried to tie some of the individual cost components to the real world--measuring the true amount of time it takes to do a sequential read vs. a seek, then adjusting random_page_cost accordingly. But if you set random_page_cost to its real-world value based on that measurement, you get a value well outside what seems to work for people in practice. This suggests the underlying cost model doesn't track the real-world quantity it's meant to represent very closely. But improving on that situation without going backwards in the quality of the plans the query optimizer produces is a tricky problem.

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@xxxxxxxxxxxxxxx  www.2ndQuadrant.com


--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
