Re: overzealous sorting?

On 27/09/11 22:05, anthony.shipman@xxxxxxxxxxxxx wrote:

What I really want is to just read a sequence of records in timestamp order
between two timestamps. The number of records to be read may be in the
millions totalling more than 1GB of data so I'm trying to read them a slice
at a time but I can't get PG to do just this.

If I use offset and limit to grab a slice of the records from a large
timestamp range then PG will grab all of the records in the range, sort them
on disk and return just the slice I want. This is absurdly slow.

The query that I've shown is one of a sequence of queries with the timestamp
range progressing in steps of 1 hour through the timestamp range. All I want
PG to do is find the range in the index, find the matching records in the
table and return them. All of the planner's cleverness just seems to get in
the way.


It is not immediately clear that the planner is making the wrong choices here. Index scans are not always the best choice; whether they win depends heavily on how well the column's values correlate with the physical order of rows in the table's heap file. I suspect the planner chose the bitmap scan because that correlation is low (check the correlation column in pg_stats to see). If you believe the table's heap data is cached anyway, this matters much less, but you have to tell the planner so by reducing random_page_cost (as advised previously). Give it a try and report back!
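
Something along these lines might be a starting point. It is only a sketch: I don't know your schema, so the table name records and the column name ts are placeholders, the timestamps are made up, and 2.0 is just an illustrative value rather than a recommendation.

```sql
-- Placeholder names: substitute your own table ('records') and timestamp column ('ts').
-- A correlation near +1 or -1 means the heap is laid out roughly in column order;
-- a value near 0 means matching rows are scattered across the heap.
SELECT tablename, attname, correlation
FROM pg_stats
WHERE tablename = 'records'
  AND attname = 'ts';

-- Lower random_page_cost for this session only (the default is 4.0).
-- 2.0 is an arbitrary trial value; adjust it to reflect how much is really cached.
SET random_page_cost = 2.0;

-- Re-run one of the hourly slices and compare the plan and timings
-- against what you saw with the default setting.
EXPLAIN ANALYZE
SELECT *
FROM records
WHERE ts >= '2011-09-27 00:00'
  AND ts <  '2011-09-27 01:00'
ORDER BY ts;

-- Put the setting back once you are done experimenting.
RESET random_page_cost;
```

Because SET only affects the current session, you can try a few values without touching postgresql.conf.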

regards

Mark

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

