On Wed, Dec 2, 2009 at 10:32 AM, Laurent Laborde <kerdezixe@xxxxxxxxx> wrote:
> * without order by, limit 5 : 70ms
> ----------------------------------
> explain analyze SELECT *
> FROM _article
> WHERE (_article.bitfield && getbit(0))
> LIMIT 5;
>
> QUERY PLAN :
> Limit  (cost=0.00..20.03 rows=5 width=1109) (actual time=70.190..70.265 rows=5 loops=1)
>   ->  Index Scan using idx_article_bitfield on _article  (cost=0.00..69290.99 rows=17298 width=1109) (actual time=70.188..70.260 rows=5 loops=1)
>         Index Cond: (bitfield && B'1'::bit varying)
> Total runtime: 70.406 ms
> (4 rows)
>
> * without order by, limit 500 (same plan as above) : 371ms
> ------------------------------------------------------------------
> explain analyze SELECT *
> FROM _article
> WHERE (_article.bitfield && getbit(0))
> LIMIT 500;
>
> QUERY PLAN :
> Limit  (cost=0.00..2002.86 rows=500 width=1109) (actual time=0.087..371.257 rows=500 loops=1)
>   ->  Index Scan using idx_article_bitfield on _article  (cost=0.00..69290.99 rows=17298 width=1109) (actual time=0.086..371.075 rows=500 loops=1)
>         Index Cond: (bitfield && B'1'::bit varying)
> Total runtime: 371.369 ms
>
> * without order by, limit 5000 (query plan changed) : 1307ms
> -------------------------------------------------------------------
> explain analyze SELECT *
> FROM _article
> WHERE (_article.bitfield && getbit(0))
> LIMIT 5000;
>
> QUERY PLAN :
> Limit  (cost=138.34..18971.86 rows=5000 width=1109) (actual time=53.782..1307.173 rows=5000 loops=1)
>   ->  Bitmap Heap Scan on _article  (cost=138.34..65294.79 rows=17298 width=1109) (actual time=53.781..1305.565 rows=5000 loops=1)
>         Recheck Cond: (bitfield && B'1'::bit varying)
>         ->  Bitmap Index Scan on idx_article_bitfield  (cost=0.00..134.01 rows=17298 width=0) (actual time=53.606..53.606 rows=6743 loops=1)
>               Index Cond: (bitfield && B'1'::bit varying)
> Total runtime: 1307.972 ms
>
> So... *without* "order by", differents limit and different query plan :
> the queries are fast.
> * with order by, limit 5 :
> ------------------------------
> explain analyze SELECT *
> FROM _article
> WHERE (_article.bitfield && getbit(0))
> ORDER BY _article.id ASC
> LIMIT 5;
>
> QUERY PLAN :
> Mmmm.... the query is running since 2h ... waiting, waiting.
>
> * with order by, limit 500 : 546ms
> -------------------------------
> explain analyze SELECT *
> FROM _article
> WHERE (_article.bitfield && getbit(0))
> ORDER BY _article.id ASC
> LIMIT 500;
>
> QUERY PLAN :
> Limit  (cost=66156.73..66157.98 rows=500 width=1109) (actual time=545.671..545.900 rows=500 loops=1)
>   ->  Sort  (cost=66156.73..66199.98 rows=17298 width=1109) (actual time=545.670..545.766 rows=500 loops=1)
>         Sort Key: id
>         Sort Method:  top-N heapsort  Memory: 603kB
>         ->  Bitmap Heap Scan on _article  (cost=138.34..65294.79 rows=17298 width=1109) (actual time=1.059..541.359 rows=6729 loops=1)
>               Recheck Cond: (bitfield && B'1'::bit varying)
>               ->  Bitmap Index Scan on idx_article_bitfield  (cost=0.00..134.01 rows=17298 width=0) (actual time=0.922..0.922 rows=6743 loops=1)
>                     Index Cond: (bitfield && B'1'::bit varying)
> Total runtime: 546.163 ms
>
> Now... with ordery by, different limit, different query plan, the
> limit 5 query is insanly *SLOW* (while the limit 500 is super fast).
>
> What is think : The query planner do not consider the time taken by
> the order by... which is *much* slower !!

That is certainly not the case. If the query planner did not consider the time required to perform a sort, that would have been fixed long before now. The real problem here is exactly what I said upthread.

Without ORDER BY, the query planner picks an index scan or a bitmap index scan and just runs it until it gets enough rows to satisfy the LIMIT. No problem. With ORDER BY, it has to make a decision: should it fetch ALL the rows that satisfy the bitfield condition, sort them by article ID, and then pick the top five?
Or should it instead use the index on article ID to start retrieving the lowest-numbered article IDs, hoping to find 5 that satisfy the bitfield condition before it goes through too many rows?

The answer depends on how frequently the bitfield condition is satisfied. If most rows in the table satisfy it, the second plan is better; if very few do, the first plan is better.

Somewhat more subtly, the choice also depends on the LIMIT. The first plan requires almost the same amount of work for a small limit as for a large one - you still have to find ALL the rows that match the bitfield condition and sort them; the LIMIT only changes how many rows of the sorted result get returned. But the amount of work the second plan requires varies dramatically with the LIMIT. If the LIMIT is only one-hundredth as large (5 instead of 500), the second plan figures to scan only one-hundredth as many rows, so LIMIT 5 costs about a hundredth as much as LIMIT 500, whereas the cost of the first plan has barely changed.

The exact break-even point between the two plans varies with the percentage of rows in the table that satisfy the bitfield condition. In your particular case, the break-even point is less than one row, so the first plan is always better, but the planner doesn't know that. I don't think the planner has any statistics that can tell it how many rows in the table satisfy (_article.bitfield && getbit(0)), and it's probably estimating high, because the actual selectivity based on the numbers you provided is quite low. That makes the second plan look appealing for small LIMITs. And if the rows it needs are clustered at the "wrong end" of the article-ID index - which the planner certainly has no way of knowing - things get really ugly.
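To make that trade-off concrete, here is a back-of-the-envelope cost model. This is my own toy sketch, not PostgreSQL's actual cost formulas, and the 10-million-row table size is an assumption I'm making up to get a selectivity out of the estimated 17298 matches:

```python
import math

def sort_plan_cost(matching_rows, cost_per_row=1.0):
    # Plan 1: scan every matching row, then sort them all.
    # The LIMIT barely affects the total - the scan and sort dominate.
    n = max(matching_rows, 2)
    return n * cost_per_row + n * math.log2(n)

def ordered_index_plan_cost(limit, selectivity, cost_per_row=1.0):
    # Plan 2: walk the article-ID index in order, testing each row
    # against the bitfield condition; on average we expect to read
    # about limit / selectivity rows before finding enough matches.
    return (limit / selectivity) * cost_per_row

# Planner's estimate: 17298 matching rows; assume (hypothetically)
# a table of 10 million rows, giving selectivity of about 0.0017.
sel = 17298 / 10_000_000

print(ordered_index_plan_cost(5, sel))    # ~2,890 units - looks cheap
print(ordered_index_plan_cost(500, sel))  # ~289,000 units - 100x more
print(sort_plan_cost(17298))              # ~261,000 units - fixed
```

On these made-up numbers, plan 2 looks roughly 90x cheaper at LIMIT 5 and loses at LIMIT 500 - which matches the plan flip you saw. The catch is that "limit / selectivity rows" assumes the matches are spread evenly through the ID index; if they sit at the far end, plan 2's real row count blows up far past the estimate.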
I've sometimes thought that the planner should outright discard plans whose total cost (ignoring the effect of LIMIT) is too many orders of magnitude higher than that of some other available plan. But it's hard to know where to put the cutoff, and there are cases where the planner makes this kind of trade-off and gets it right, which we don't want to break, so it's not a simple problem.

The best solution I've found is to avoid using expressions that depend on operators other than equality whenever possible. The planner is very smart about equality. It's somewhat smart about >, <, >=, <=, and <>, and pretty stupid about most other things.

...Robert

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
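P.S. on why equality is so much friendlier to the planner - a purely conceptual sketch, not PostgreSQL internals (the function names and the 0.005 default below are invented for illustration): for an equality predicate the planner can look the value up in the column's most-common-values statistics, while for an operator like && on a bit varying column it has no applicable statistics and can only fall back on a generic guess.

```python
def eq_selectivity(mcv, value, default=0.005):
    # Equality: if the value appears in the column's most-common-values
    # list, its observed frequency is a direct selectivity estimate.
    return mcv.get(value, default)

def opaque_operator_selectivity():
    # An operator with no statistics support: nothing to look up, so
    # the estimate is a hard-coded guess regardless of the actual data.
    return 0.005  # made-up default, standing in for a magic constant

status_mcv = {"published": 0.90, "draft": 0.09, "deleted": 0.01}
print(eq_selectivity(status_mcv, "deleted"))  # 0.01 - grounded in stats
print(opaque_operator_selectivity())          # 0.005 - a blind guess
```

With the equality estimate the planner would correctly treat "deleted" rows as rare; with the blind guess it may wildly over- or under-estimate, which is exactly what makes plans like the ordered-index scan above look deceptively cheap.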