Re: Window functions and index usage

Anssi Kääriäinen <anssi.kaariainen@xxxxxx> · Tue, 4 Oct 2011 18:00:55 +0300

On 10/04/2011 05:36 PM, Tom Lane wrote:
The cost estimates I get are 806 for bitmap scan and sort, 2097 for
seqscan and sort, 4890 for indexscan without sort.  It *can* use the
index for that query ... it just doesn't think it's a good idea.  It's
probably right, too.  At least, the actual runtimes go in the same order
on my machine.  Seqscan-and-sort very often beats an indexscan for
sorting a table, unless the table is clustered on the index or nearly so.
I tested it and yes, it can use the index scan. But not in the way I 
though it would be used.
Note that it cannot use the index for both ordering and satisfying the
IN condition.  If it used the =ANY clause as an index condition, what
that would imply is three separate index searches and so the results
wouldn't necessarily be correctly ordered.  This is why the plain
indexscan costs out so expensive: it's a full-table scan.
This I don't understand. I would imagine it would be possible to execute 
this query as get 5 first values for id 1, get 5 first values for id 2, 
get 5 first values for id 3. At least if I do this by hand using UNION I 
get two orders of magnitude faster execution time. I am not saying it 
would be easy to do that, but to me it seems it would be possible to use 
the index more efficiently for the example query. Or is the following 
UNION query not equivalent to the window function query, assuming I am 
not interested in the row_number column itself?

(select id, seq from test where id = 1 order by seq limit 5)
union
(select id, seq from test where id = 2 order by seq limit 5)
union
(select id, seq from test where id = 3 order by seq limit 5);

The results are in different order, but there is no order by in the 
original query except in the OVER clause, so it should not matter.

 - Anssi Kääriäinen

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance