Re: Yet another abort-early plan disaster on 9.3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 09/19/2014 10:15 AM, Merlin Moncure wrote:
> On Wed, Sep 17, 2014 at 7:11 PM, Josh Berkus <josh@xxxxxxxxxxxx> wrote:
>> This is the core issue with abort-early plans; they depend on our
>> statistics being extremely accurate, which we know they are not. And if
>> they're wrong, the execution time climbs by 1000X or more.  Abort-early
>> plans are inherently riskier than other types of query plans.
>>
>> What I'm not clear on is why upgrading from 9.1 to 9.3 would bring about
>> this change.  The stats are no more than 10% different across the
>> version change.
> 
> Amusingly on-topic rant I happened to read immediately after this by chance:
> 
> http://wp.sigmod.org/?p=1075
> 
> Is there a canonical case of where 'abort early' plans help? (I'm new
> to that term -- is it a recent planner innovation...got any handy
> links?)

Yeah, here's an example of the canonical case:

Table t1 ( a, b, c )

- "b" is low-cardinality
- "c" is high-cardinality
- There are separate indexes on both b and c.

SELECT a, b, c FROM t1
WHERE b = 2
ORDER BY c LIMIT 1;

In this case, the fastest plan is usually to use the index on C and
return the first row where the filter condition matches the filter on b.
 This can be an order of magnitude faster than using the index on b and
then resorting by c and taking the first row, if (b=2) happens to match
20% of the table.

This is called an "abort early" plan because we expect to never finish
the scan on the index on c.  We expect to scan the index on c, find the
first row that matches b=2 and exit.

The problem with such plans is that they are "risky".  As in, if we are
wrong about our (b=2) stats, then we've just adopted a query plan which
will be 10X to 1000X slower than the more conventional plan.

We can see this in the bad plan I posted:

 Limit  (cost=1.00..2641.10 rows=20 width=202) (actual
time=9.933..711.372 rows=20 loops=1)
   ->  Nested Loop Semi Join  (cost=1.00..9641758.39 rows=73041
width=202) (actual time=9.931..711.361 rows=20 loops=1)
         ->  Index Scan Backward using index_categories_on_recorded_on
on categories  (cost=0.43..406943.98 rows=4199200 width=202) (actual
time=0.018..275.020 rows=170995 loops=1)

Notice how the total cost of the plan is a fraction of the cost of the
two steps which preceeded it?  This is an indication that the planner
expects to be able to abort the index scan and nestloop join before it's
more than 2% through it.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance




[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux