Kevin Grittner <kgrittn@xxxxxxxxx> writes:
> Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
>> I wonder if we could ameliorate this problem by making
>> get_actual_variable_range() use SnapshotDirty
>> In that thread I claimed that a current MVCC snapshot was the
>> most appropriate thing, which it probably is;

> If it reads from the end of the index, won't it actually be reading
> starting with the value we will obtain using SnapshotDirty, and
> since the transaction is not yet committed, won't we be making the
> trip to the heap for each index tuple we read from that point?  Why
> doesn't that make SnapshotDirty more appropriate for purposes of
> cost estimation?

This code isn't used (directly) for cost estimation.  It's used for
selectivity estimation, that is, counting how many valid rows the query
will fetch when it executes.  We can't know that with certainty, of
course, so the argument boils down to how good an approximation we're
getting and how much it costs to get it.

The original coding here was SnapshotNow, which could be defended on
the grounds that it was the newest available data and thus the best
possible approximation if the query gets executed with a snapshot
taken later.  That's still what's being used in the field, and of
course it's got the problem we're on about, that uncommitted rows
cause us to move on to the next index entry (and thrash the
ProcArrayLock for each uncommitted row we hit :-().

In HEAD we replaced that with GetActiveSnapshot, which is a better
approximation if you assume the query will be executed with this same
snapshot, and otherwise slightly worse (because older).  But it's just
the same as far as the cost imposed by uncommitted rows goes.

My proposal to use SnapshotDirty instead is based on the thought that
it will give the same answers as SnapshotNow for definitely-live or
definitely-dead rows, while not imposing such large costs for in-doubt
rows: it will accept the first in-doubt row, and thus prevent us from
moving on to the next index entry.  The reported problems seem to
involve hundreds or thousands of in-doubt rows at the extreme of the
index, so the cost will go down proportionally to whatever that count
is.  As far as the accuracy of the approximation is concerned, it's a
win if you imagine that pending inserts will have committed by the
time the query's execution snapshot is taken.  Otherwise not so much.

The other thing we might consider doing is using SnapshotAny, which
would amount to just taking the extremal index entry at face value.
This would be cheap all right (in fact, we might later be able to
optimize it to not even visit the heap).  However, I don't like the
implications for the accuracy of the approximation.  It seems quite
likely that an erroneous transaction could insert a wildly
out-of-range index entry and then roll back --- or even if it
committed, somebody might soon come along and clean up the bogus
entry in a separate transaction.  If we use SnapshotAny then we'd
believe that bogus entry until it got vacuumed, not just till it got
deleted.  This is a fairly scary proposition, because the wackier
that extremal value is, the more impact it will have on the
selectivity estimates.

If it's demonstrated that SnapshotDirty doesn't reduce the estimation
costs enough to solve the performance problem the complainants are
facing, I'd be willing to consider using SnapshotAny here.  But I
don't want to take that step until it's proven that the more
conservative approach doesn't solve the performance problem.
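To make the shape of that concrete for anyone who hasn't opened
selfuncs.c lately, here is a heavily condensed sketch of the endpoint
probe in get_actual_variable_range() with a dirty snapshot.  It is
only a sketch, not the patch itself: the function name and argument
list are made up for illustration, scan-key setup, datum extraction,
and error cleanup are omitted, and it assumes the existing index-AM
calls (index_beginscan and friends) with selfuncs.c's headers already
in scope.

static bool
probe_index_endpoint_sketch(Relation heapRel, Relation indexRel,
                            ScanKey scankeys, ScanDirection indexscandir)
{
    SnapshotData SnapshotDirty;
    IndexScanDesc index_scan;
    HeapTuple   tup;
    bool        have_data = false;

    /*
     * A dirty snapshot accepts the first in-doubt (uncommitted) row
     * instead of skipping past it, so at most one index entry gets
     * examined even when hundreds of uncommitted rows sit at the end
     * of the index.  HEAD currently passes GetActiveSnapshot() here,
     * which keeps advancing over every in-doubt entry.
     */
    InitDirtySnapshot(SnapshotDirty);

    index_scan = index_beginscan(heapRel, indexRel, &SnapshotDirty, 1, 0);
    index_rescan(index_scan, scankeys, 1, NULL, 0);

    /* Fetch the first heap tuple the dirty snapshot will accept. */
    if ((tup = index_getnext(index_scan, indexscandir)) != NULL)
    {
        /* ... extract the index column value from tup here ... */
        have_data = true;
    }

    index_endscan(index_scan);

    return have_data;
}

The only change of substance relative to what's in HEAD is which
snapshot gets handed to index_beginscan().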
There's an abbreviated version of this argument in the comments in my
proposed patch at
http://www.postgresql.org/message-id/11927.1384199294@xxxxxxxxxxxxx
What I'm hoping will happen next is that the complainants will
hot-patch that and see if it fixes their problems.  We can't really
determine what to do without that information.

> It doesn't look like Robert did either, if you read the whole
> message.  In fact, he also questioned why index tuples which would
> need to be read if we process from that end of the index don't
> matter for purposes of cost estimation.

The key point here is that we are not looking to estimate the cost of
getting the extremal value.  This code gets called if we are
interested in a probe value falling within the endmost histogram bin;
that's not necessarily, or even probably, the extremal value.  We're
just trying to find out if the bin endpoint has moved a lot since the
last ANALYZE.

Another way of explaining this is that what Robert was really
suggesting is that we ought to account for the cost of fetching rows
that are then determined not to be visible to the query.  That's
true, but it would be a conceptual error to factor that in by
supposing that they're visible.  The planner doesn't, at present,
account for this cost explicitly, but I believe it's at least
somewhat handled by the fact that we charge for page accesses on the
basis of selectivity fraction times physical table size.  A table
containing a lot of uncommitted/dead tuples will get charged more I/O
than a less bloated table.  Of course, this approach can't account
for special cases like "there's a whole lot of uncommitted tuples
right at the end of the index range, and not so many elsewhere".
That's below the granularity of what we can sensibly model at the
moment, I'm afraid.

			regards, tom lane
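P.S.  To put a number on the "bloated table gets charged more I/O"
point, here's a back-of-the-envelope illustration.  The numbers and
the helper function are invented for the example; this is not the
planner's actual cost arithmetic, which involves more than a simple
product.

#include <stdio.h>

/* Simplified stand-in for "pages charged ~ selectivity * table size". */
static double
charged_pages(double selectivity, double physical_pages)
{
    return selectivity * physical_pages;
}

int
main(void)
{
    double selectivity = 0.01;      /* query fetches 1% of the rows */
    double tight_pages = 10000.0;   /* freshly vacuumed table */
    double bloated_pages = 25000.0; /* same live rows, much dead space */

    printf("tight table:   %.0f pages charged\n",
           charged_pages(selectivity, tight_pages));
    printf("bloated table: %.0f pages charged\n",
           charged_pages(selectivity, bloated_pages));
    return 0;
}

Same selectivity and the same live data, but the bloated table gets
charged two and a half times the I/O, which is roughly how the extra
fetches of invisible rows get paid for --- on average, if not in the
special cases mentioned above.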