Re: Window functions and index usage

Tom Lane <tgl@xxxxxxxxxxxxx> · Tue, 04 Oct 2011 10:36:51 -0400

=?ISO-8859-1?Q?Anssi_K=E4=E4ri=E4inen?= <anssi.kaariainen@xxxxxx> writes:
> I have the following setup:
> create table test(id integer, seq integer);
> insert into test select generate_series(0, 100), generate_series(0, 1000);
> create unique index test_idx on test(id, seq);
> analyze test;

> Now I try to fetch the latest 5 values per id, ordered by seq from the 
> table:

> select * from (
> select id, seq, row_number() over (partition by id order by seq)
>    from test
>   where id in (1, 2, 3)
> ) where row_number() <= 5;

> This does not use the index on id, seq for sorting the data. It uses a 
> bitmap index scan, and sequential scan when issued SET enable_bitmapscan 
> to false.

The cost estimates I get are 806 for bitmap scan and sort, 2097 for
seqscan and sort, 4890 for indexscan without sort.  It *can* use the
index for that query ... it just doesn't think it's a good idea.  It's
probably right, too.  At least, the actual runtimes go in the same order
on my machine.  Seqscan-and-sort very often beats an indexscan for
sorting a table, unless the table is clustered on the index or nearly so.

Note that it cannot use the index for both ordering and satisfying the
IN condition.  If it used the =ANY clause as an index condition, what
that would imply is three separate index searches and so the results
wouldn't necessarily be correctly ordered.  This is why the plain
indexscan costs out so expensive: it's a full-table scan.

			regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance