Re: Cost of sort/order by not estimated by the query planner

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



hummm.... Adding pgsql-perf :)

On Mon, Nov 30, 2009 at 5:54 PM, Laurent Laborde <kerdezixe@xxxxxxxxx> wrote:
> Friendly greetings !
> I use postgresql 8.3.6.
>
> here is a few info about the table i'm querying :
> -------------------------------------------------------------
> - select count(*) from _article : 17301610
> - select count(*) from _article WHERE (_article.bitfield && getbit(0)) : 6729
>
>
> Here are both request with problems :
> --------------------------------------------------
>
> QUERY 1 : Very fast !
> -------------
>
> explain SELECT *
> FROM   _article
> WHERE (_article.bitfield && getbit(0))
> ORDER BY _article.id ASC
> LIMIT 500;
>                                             QUERY PLAN
> -----------------------------------------------------------------------------------------------------
>  Limit  (cost=66114.13..66115.38 rows=500 width=1114)
>   ->  Sort  (cost=66114.13..66157.37 rows=17296 width=1114)
>         Sort Key: id
>         ->  Bitmap Heap Scan on _article  (cost=138.32..65252.29
> rows=17296 width=1114)
>               Recheck Cond: (bitfield && B'1'::bit varying)
>               ->  Bitmap Index Scan on idx_article_bitfield
> (cost=0.00..134.00 rows=17296 width=0)
>                     Index Cond: (bitfield && B'1'::bit varying)
>
>
>
>
> QUERY 2 : Endless ... (more than 30mn... i stopped the query)
> -------------
>
> explain SELECT *
> FROM   _article
> WHERE (_article.bitfield && getbit(0))
> ORDER BY _article.id ASC
> LIMIT 5;
>                                           QUERY PLAN
> -------------------------------------------------------------------------------------------------
>  Limit  (cost=0.00..2042.87 rows=5 width=1114)
>   ->  Index Scan using _article_pkey on _article
> (cost=0.00..7066684.46 rows=17296 width=1114)
>         Filter: (bitfield && B'1'::bit varying)
> (3 rows)
>
>
> With LIMIT 5 and LIMIT 500, the query plan are differents.
> Postgresql estimate that it can do a a simple index scan to find only 5 row.
> With more than LIMIT ~400 it estimate that it's faster to do a more
> complex plan.
> and it make sense !
>
> The problem is in the order by, of course.
> If i remove the "order by" the LIMIT 5 is faster (0.044 ms) and do an
> index scan.
> At limit 500 (without order) it still use an index scan and it is
> slightly slower.
> At limit 5000 (without order) it switch to a Bitmap Index Scan +
> Bitmap Heap Scan and it's slower but acceptable (5.275 ms)
>
> Why, with the "QUERY 2", postgresql doesn't estimate the cost of the
> Sort/ORDER BY ?
> Of course, by ignoring the order, both query plan are right and the
> choice for thoses differents plans totally make sense.
>
> But... if the planner would be kind enough to considerate the cost of
> the order by, it would certainly choose the Bitmap Index + Bitmap Heap
> scan for the limit 5.
> And not an index_scan pkey !
>
> I have set the statistics to 1000 for _article.bitfield, just in case
> (and ran a vacuum analyze), it doesn't change anything.
>
> Is that a bug ? any Idea ?
>
> Thank you :)
>
> --
> Laurent "ker2x" Laborde
> Sysadmin & DBA at http://www.over-blog.com/
>



-- 
Laurent "ker2x" Laborde
Sysadmin & DBA at http://www.over-blog.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux