Rikard Pavelic <rikard@xxxxxx> writes:
> I was investigating some performance issues and stumbled upon this behavior:
>
> create table main_table (i serial primary key, data varchar, ord int);
> create view main_view_order as select m.i, m.data, m.ord from main_table m order by m.i desc;
> insert into main_table select i, i::text, i/10 from generate_series(1,1000000) i;
> create index ix_ord on main_table(ord);
> analyze main_table;
>
> explain analyze select * from main_view_order m where m.ord >= 5000 and m.ord <= 5500 limit 10;
>
> Limit  (cost=0.00..69.01 rows=10 width=14) (actual time=330.943..330.951 rows=10 loops=1)
>   ->  Index Scan Backward using main_table_pkey on main_table m  (cost=0.00..36389.36 rows=5281 width=14) (actual time=330.937..330.940 rows=10 loops=1)
>         Filter: ((ord >= 5000) AND (ord <= 5500))
> Total runtime: 330.975 ms
>
> I haven't found it on TODO or in the archives, so I'm wondering if this is
> known behavior.

There is nothing particularly wrong with that plan, or at least I'd not
recommend holding your breath waiting for it to get better.

Given this scenario, there are two possible (index-based) plans.  The one
above scans the pkey index in decreasing order, reports out only the rows
satisfying the "ord" condition, and stops as soon as it has 10 rows.  The
only other alternative is to scan the ord index to collect the 5000 or so
rows satisfying the "ord" condition, sort them all by "i", and then throw
away 4990 of them and return just the first 10.

The planner realizes that about 1/200th of the table satisfies the "ord"
condition, so it estimates that the first plan will require scanning about
2000 entries in the pkey index to get 10 results.  That looks significantly
cheaper than the other plan, which would require 5000 index fetches, not to
mention a sort step.

Now, in this artificial test case, the cost estimate is wrong because "i"
and "ord" are perfectly correlated and all the desired rows are quite far
down the descending-i index scan, so the chosen plan actually has to scan a
lot more than 2000 index entries.  In a more realistic case that plan would
probably work noticeably better.  However, the planner has no statistics
that would tell it about the degree of order correlation of the two
columns, so it's not able to find that out.

Having said all that, neither of these plan choices is exactly ideal; the
planner is basically reduced to having to guess which one will suck less.
You might try experimenting with two-column indexes on (i, ord) or
(ord, i) to give the planner some other cards to play.  I'm not sure how
much that would help in this exact query type, but for example the common
case of "where x = constant order by y" is perfectly matched to a btree
index on (x, y).

			regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
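
A minimal sketch of the two-column-index experiment suggested above, reusing
Rikard's test schema.  The index names ix_ord_i and ix_i_ord are invented
here for illustration, and whether either index actually changes the plan
for this range-plus-LIMIT query is exactly the part Tom says he is unsure
about:

-- give the planner some other cards to play (hypothetical index names)
create index ix_ord_i on main_table (ord, i);
create index ix_i_ord on main_table (i, ord);
analyze main_table;

explain analyze
select * from main_view_order m
 where m.ord >= 5000 and m.ord <= 5500
 limit 10;

With (ord, i) the planner can at least confine itself to the ~5000 index
entries in the ord range before sorting; it still has to weigh that against
the backward pkey scan using the same correlation-blind cost estimates
described above.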
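
For the "where x = constant order by y" case Tom mentions, here is a hedged
illustration with made-up names (tbl, x, y, ix_x_y).  With an equality
condition on the leading column, a btree on (x, y) hands back the matching
rows already ordered by y, so the LIMIT can stop early with no sort step:

-- hypothetical table and index, for illustration only
create table tbl (x int, y int, payload text);
create index ix_x_y on tbl (x, y);

-- rows with x = 42 come back in y order straight from the index,
-- so the scan can stop after 10 rows and no separate sort is needed
explain analyze
select * from tbl
 where x = 42
 order by y
 limit 10;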