On Fri, Dec 15, 2006 at 09:56:57AM -0500, Tom Lane wrote: > Martijn van Oosterhout <kleptog@xxxxxxxxx> writes: > > On Fri, Dec 15, 2006 at 12:20:46PM +0000, Simon Riggs wrote: > >> Maybe sampling every 10 rows will bring things down to an acceptable > >> level (after the first N). You tried less than 10 didn't you? > > > Yeah, it reduced the number of calls as the count got larger. It broke > > somewhere, though I don't quite remember why. > > The fundamental problem with it was the assumption that different > executions of a plan node will have the same timing. That's not true, > in fact not even approximately true. IIRC the patch did realize > that first-time-through is not a predictor for the rest, but some of > our plan nodes have enormous variance even after the first time. > I think the worst case is batched hash joins. It didn't assume that because that's obviously bogus. It assumed the durations would be spread as a normal distribution. Which meant that over time the average of the measured iterations would approch the actual average. It tried to take enough measurements to try and keep expected error small, but it's statistics, you can only say "this will give the right answer >95% of the time". You are correct though, the error was caused by unexpectedly large variations, or more likely, an unexpected distribution curve. Statistically, we took enough samples to not be affected significantly by large variations. Even if it looked more like a gamma distribution it should not have been as far off as it was. Looking at alternative approaches, like sampling with a timer, you end up with the same problem: sometimes the calculations will fail and produce something strange. The simplest example being than a 100Hz timer is not going to produce any useful information for queries in the millisecond range. A higher frequency timer than that is not going to be available portably. You could probably throw more effort into refining the statistics behind it, but at some point we're going to have to draw a line and say: it's going to be wrong X% of the time, deal with it. If we're not willing to say that, then there's no point going ahead with any statistical approach. Have a nice day, -- Martijn van Oosterhout <kleptog@xxxxxxxxx> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment:
signature.asc
Description: Digital signature