Re: SQL select query becomes slow when using limit (with no offset)

On Mon, Aug 10, 2009 at 11:19 AM, Kevin Grittner
<Kevin.Grittner@xxxxxxxxxxxx> wrote:
> Robert Haas <robertmhaas@xxxxxxxxx> wrote:
>
>> Just handling better the case where we pick a straight nested loop
>> rather than a hash join would help a lot of people.  Some basic
>> conservatism about the number of outer rows would be useful here (in
>> particular, we should probably assume that there will be at least 2
>> when costing a nested loop, unless the outer side is known unique),
>> and it's also worth thinking about the fact that a hash join won't
>> build the table unless there is at least 1 outer row, which I don't
>> think the current costing algorithm takes into account.  Or maybe we
>> should estimate the amount by which the nested loop figures to beat
>> out the hash join and only accept the nested loop plan if the win
>> exceeds some threshold percentage.
>
> But in our environment the most common cause of a sub-optimal planning
> choice is over-estimating the cost of a nested loop.  We've never been
> able to get good plans overall without dropping random_page_cost to
> twice the seq_page_cost or less -- in highly cached situations we
> routinely drop both to 0.1.

Even if our statistics were perfect, you'd still need to do this if
your database is mostly cached.  That's routine tuning.
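
For concreteness, the tuning amounts to a couple of GUCs per session
(or postgresql.conf for a global change); a minimal sketch, with an
invented table and join:

    -- Per-session override for a highly cached database, using the
    -- 0.1 values mentioned above.
    SET seq_page_cost = 0.1;
    SET random_page_cost = 0.1;

    -- With cheap random I/O the planner is much more willing to pick
    -- an index scan / nested loop over a seqscan + hash join.
    EXPLAIN ANALYZE
    SELECT o.id, c.name
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    LIMIT 10;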

> Creating further bias against nested loop
> plans just means we'll have to push the numbers further from what
> they purportedly represent.

Not at all.  You'd have to set them closer to their real values, i.e.
the cost of reading a page from cache.
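
As a rough sketch of that arithmetic (4.0 and 1.0 are the stock
defaults; 0.1 is the figure quoted above for highly cached setups):

    -- Defaults assume spinning disk:
    --   seq_page_cost    = 1.0
    --   random_page_cost = 4.0   -- random I/O ~4x dearer than sequential
    --
    -- Once the working set is cached, a "random" page fetch costs about
    -- the same as a sequential one, so the honest settings converge:
    --   seq_page_cost    = 0.1
    --   random_page_cost = 0.1   -- ratio ~1, both cheap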

> It seems to me a significant unsolved problem
> is properly accounting for caching effects.

Definitely true.

> That said, I have no really clear idea of how best to solve that
> problem.  The only ideas which recur when facing these issues are:
>
> (1)  The planner refuses to deal with fractional estimates of how
> many rows will be returned in a loop -- it treats 0.01 as 1 on the
> basis that you can't read a fractional row, rather than as a 1% chance
> that you will read a row and need to do the related work.  I have
> always thought that changing this might allow more accurate estimates;
> perhaps I should hack a version which behaves that way and test it as
> a "proof of concept."  Note that this is diametrically opposed to your
> suggestion that we always assume at least two rows in the absence of a
> unique index.

You're right. The problem is that you can get hosed in both
directions.  My queries always reference a column they have a foreign
key to, so this never happens to me.  But it does
happen to other people.
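
To put made-up numbers on "hosed in both directions" (the 8.0
per-execution cost is arbitrary, purely for illustration):

    -- Inner side of a nested loop costing 8.0 per execution.
    --
    -- Clamping up: selectivity says 0.01 rows per outer row, clamped to 1:
    --   charged:  1    * 8.0 = 8.0
    --   "true":   0.01 * 8.0 = 0.08     -- 100x over-charged
    --
    -- Guessing low: planner assumes 1 outer row, 1000 show up at run time:
    --   charged:  1    * 8.0 = 8.0
    --   actual:   1000 * 8.0 = 8000.0   -- 1000x under-charged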

> (2)  Somehow use effective_cache_size in combination with some sort of
> current activity metrics to dynamically adjust random access costs.
> (I know, that one's total hand-waving, but it seems to have some
> possibility of better modeling reality than what we currently do.)

Yeah, I gave a lightning talk on this at PGCon, but I haven't had time
to do anything with it.  There are a couple of problems.  One is that
you have to have a source for your current activity metrics.  Since a
lot of the pages of interest will be in the OS buffer pool rather than
PG shared buffers, there's no easy way to get at them.  You also have
to keep in mind that plans can be cached and reused, so the estimates
must not change too fast or you'll have horrible plan stability
problems.
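
For what it's worth, the only activity metrics readily at hand today
are PostgreSQL's own buffer counters, which illustrates the first
problem: OS-cache hits simply aren't visible.  A quick look:

    -- Hit ratios from PG shared buffers only; reads that were
    -- satisfied by the OS page cache still count as heap_blks_read.
    SELECT relname,
           heap_blks_hit,
           heap_blks_read,
           round(heap_blks_hit::numeric
                 / nullif(heap_blks_hit + heap_blks_read, 0), 3) AS hit_ratio
    FROM pg_statio_user_tables
    ORDER BY heap_blks_hit + heap_blks_read DESC
    LIMIT 10;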

The other is that right now, the page costs are constant.  Making them
per-relation will mean that they require syscache lookups.  I'm not
sure whether that's expensive enough to impact planning time, and if
so what to do about it.
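
If page costs did become per-relation, one could imagine hanging them
off the existing reloptions machinery.  To be clear, this is purely
hypothetical syntax -- random_page_cost is not an accepted storage
parameter in any released version:

    -- HYPOTHETICAL: illustrates the per-relation idea only; this
    -- statement is rejected by real servers.
    ALTER TABLE orders SET (random_page_cost = 0.1);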

...Robert


