Markus, thanks for your reply.

> The pipelined top-n query has two very important properties:
> (1) it utilizes the index order to avoid the sort operation required
>     to satisfy the ORDER BY clause
> (2) it realizes that it can stop processing as soon as it has
>     delivered enough rows.
>
> The execution plan from Thomas Kellerer seems to fulfill requirement
> (1) but definitely not (2).
>
> Even with 9.3.2, I was not able to reproduce Thomas' result (no sort
> operation showing up in the execution plan) with the test data I also
> published at my website:
> http://use-the-index-luke.com/sql/example-schema/postgresql/partial-results
>
> Then I started fiddling around with the planner's cost settings and
> finally managed to get a plan similar to Thomas' when setting
> random_page_cost to 0.1 (setting it to 1, equal to seq_page_cost, was
> not enough). However, that proves the point that PostgreSQL can use
> an index to avoid the sort operation caused by ORDER BY (even for
> window functions). I'd be curious what settings caused Thomas to get
> this result.

Good point, I should adopt the habit of mentioning the config settings
for this kind of thing:

shared_buffers = 2048MB
temp_buffers = 16MB
work_mem = 32MB
seq_page_cost = 1.0
random_page_cost = 1.5
cpu_tuple_cost = 0.001
cpu_index_tuple_cost = 0.001
effective_cache_size = 2048MB

> The second requirement for pipelined top-n queries is not satisfied
> in Thomas' execution plan: it does read the full index (actual
> rows=1000000), and applies the window function over all rows. Only at
> the end does it throw away all non-conforming rows (Rows Removed by
> Filter: 999899). A pipelined top-n execution would not cause more
> than 300 rows read from the index, and only 200 rows removed by
> filter. That's what the Oracle DB and SQL Server manage to do
> (Thomas: ping me to track down why Oracle didn't for you).
> Considering that PG can use the index order to avoid the sort, it
> still doesn't make very much sense if it cannot abort the index scan
> after fetching enough rows. So, not using the index might even be the
> right choice in unfiltered cases like this.

I probably should have read the definition of "index usage" more
carefully, thanks for the clarification.

I created the test data from your webpage locally and then ran the
window function statement from
http://use-the-index-luke.com/sql/example-schema/postgresql/partial-results

SELECT *
FROM (
   SELECT sales.*,
          ROW_NUMBER() OVER (ORDER BY sale_date DESC, sale_id DESC) rn
   FROM sales
) tmp
WHERE rn BETWEEN 11 AND 20
ORDER BY sale_date DESC, sale_id DESC;

This does an "Index Scan Backward" on an index defined as
(sale_date, sale_id), but takes nearly 2.5 seconds on my computer. The
version using OFFSET .. LIMIT took about 0.05 seconds, and its runtime
increases as the offset increases, which is to be expected - whereas
the window function version stays pretty much constant even with a
starting point of 100001 - but the OFFSET version is still *much*
faster even then.

Regards
Thomas

P.S.: Btw: thanks for your book and your webpage, both are very
inspiring.
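
P.P.S.: For reference, the OFFSET .. LIMIT variant I mention above is a
query of roughly this shape (illustrative sketch only, not necessarily
the exact statement I timed):

SELECT *
  FROM sales
 ORDER BY sale_date DESC, sale_id DESC
 OFFSET 10   -- skip rows 1-10
 LIMIT 10;   -- return rows 11-20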
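And since requirement (2) came up: a minimal sketch of a keyset
("seek method") style query, which should satisfy both properties on
the (sale_date, sale_id) index by seeking from the last row of the
previous page (placeholder names are just for illustration):

SELECT *
  FROM sales
 WHERE (sale_date, sale_id) < (:last_sale_date, :last_sale_id)
 ORDER BY sale_date DESC, sale_id DESC
 LIMIT 10;   -- reads only the next 10 index entries, no offset to skip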