Re: Yet another abort-early plan disaster on 9.3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Sep 17, 2014 at 7:11 PM, Josh Berkus <josh@xxxxxxxxxxxx> wrote:
> Folks,
>
> Just encountered another case of critical fail for abort-early query
> plans.  In this case, it will completely prevent a user from upgrading
> to 9.3; this is their most common query, and on 9.3 it takes 1000X longer.
>
> Maybe we should think about removing abort-early plans from 9.5?
> Clearly we don't understand them well enough for them to work for users.
>
> Query:
>
> SELECT "categories".* FROM "categories" WHERE "categories"."user_id" IN
> ( SELECT to_user_id FROM "tags" WHERE "tags"."from_user_id" = 53529975 )
> ORDER BY recorded_on DESC LIMIT 20;
>
> Here's the plan from 9.1:
>
>  Limit  (cost=1613.10..1613.15 rows=20 width=194) (actual
> time=0.503..0.509 rows=20 loops=1)
>    ->  Sort  (cost=1613.10..1736.14 rows=49215 width=194) (actual
> time=0.502..0.505 rows=20 loops=1)
>          Sort Key: categories.recorded_on
>          Sort Method: top-N heapsort  Memory: 30kB
>          ->  Nested Loop  (cost=248.80..303.51 rows=49215 width=194)
> (actual time=0.069..0.347 rows=81 loops=1)
>                ->  HashAggregate  (cost=248.80..248.81 rows=1 width=4)
> (actual time=0.050..0.054 rows=8 loops=1)
>                      ->  Index Scan using unique_index_tags on tags
> (cost=0.00..248.54 rows=103 width=4) (actual time=0.020..0.033 rows=8
> loops=1)
>                            Index Cond: (from_user_id = 53529975)
>                ->  Index Scan using index_categories_on_user_id on
> categories  (cost=0.00..54.34 rows=29 width=194) (actual
> time=0.010..0.028 rows=10 loops=8)
>                      Index Cond: (user_id = tags.to_user_id)
>  Total runtime: 0.641 ms
>
> And from 9.3:
>
>  Limit  (cost=1.00..2641.10 rows=20 width=202) (actual
> time=9.933..711.372 rows=20 loops=1)
>    ->  Nested Loop Semi Join  (cost=1.00..9641758.39 rows=73041
> width=202) (actual time=9.931..711.361 rows=20 loops=1)
>          ->  Index Scan Backward using index_categories_on_recorded_on
> on categories  (cost=0.43..406943.98 rows=4199200 width=202) (actual
> time=0.018..275.020 rows=170995 loops=1)
>          ->  Index Scan using unique_index_tags on tags
> (cost=0.57..2.20 rows=1 width=4) (actual time=0.002..0.002 rows=0
> loops=170995)
>                Index Cond: ((from_user_id = 53529975) AND (to_user_id =
> categories.user_id))
>  Total runtime: 711.457 ms
>
> So, here's what's happening here:
>
> As usual, PostgreSQL is dramatically undercounting n_distinct: it shows
> chapters.user_id at 146,000 and the ratio of to_user_id:from_user_id as
> being 1:105 (as opposed to 1:6, which is about the real ratio).  This
> means that PostgreSQL thinks it can find the 20 rows within the first 2%
> of the index ... whereas it actually needs to scan 50% of the index to
> find them.
>
> Removing LIMIT causes 9.3 to revert to the "good" plan, as expected.
>
> This is the core issue with abort-early plans; they depend on our
> statistics being extremely accurate, which we know they are not. And if
> they're wrong, the execution time climbs by 1000X or more.  Abort-early
> plans are inherently riskier than other types of query plans.
>
> What I'm not clear on is why upgrading from 9.1 to 9.3 would bring about
> this change.  The stats are no more than 10% different across the
> version change.

Amusingly on-topic rant I happened to read immediately after this by chance:

http://wp.sigmod.org/?p=1075

Is there a canonical case of where 'abort early' plans help? (I'm new
to that term -- is it a recent planner innovation...got any handy
links?)

merlin


-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance




[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux