On 17/12/2010, at 8:27 PM, Filip Rembiałkowski wrote:
2010/12/17 Royce Ausburn <royce@xxxxxxxxxxx>
Hi all,
I have a table that in the typical case holds two-minute sample data for a few thousand sources. Often we need to report on these data for a particular source over a particular time period, and we're finding this query tends to get a bit slow.
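For reference, the query is roughly of this shape (table and column names below are placeholders rather than the real schema):

EXPLAIN ANALYZE
SELECT *
FROM samples                       -- hypothetical table name
WHERE ip = '10.0.0.1'              -- the source being reported on (assumed column)
  AND sample_time >= '2010-12-01'  -- reporting window (assumed column and values)
  AND sample_time <  '2010-12-08';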
how about (auto)vacuuming?
A key piece of information I left out: we almost never update rows in this table.
I figure at most there should only be ~20,000 rows to be read from disk, and I expect that the index is doing a pretty good job of making sure only the rows that need reading are read. Inclusion of the ip in the query is almost redundant, as most of the time an ip has its own collection... My suspicion is that the rows that we're interested in are very sparsely distributed on disk, so we're having to read too many pages for the query...
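One quick way to check that suspicion is to look at the planner's correlation statistic for the columns the index covers: values near 1 or -1 mean the rows are stored roughly in that column's order, values near 0 mean they are scattered across the table. (Table and column names here are assumed, not the real schema.)

SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'samples'              -- hypothetical table name
  AND attname IN ('ip', 'sample_time');  -- assumed index columns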
you can test this suspicion in a very simple way:
- create a test table (like yours, including indexes and constraints, but with no data)
- insert into test select * from yours order by ...
- analyze the test table
- test the query on the new table
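Spelled out as SQL, that test might look like the following; object names are placeholders, and LIKE ... INCLUDING INDEXES needs 8.3 or later, so on older servers the indexes would have to be recreated by hand:

-- empty copy of the original table's structure
CREATE TABLE samples_test
  (LIKE samples INCLUDING INDEXES INCLUDING CONSTRAINTS);

-- copy the data in the order the reports scan it (assumed sort key)
INSERT INTO samples_test
SELECT * FROM samples ORDER BY ip, sample_time;

-- refresh statistics, then run the report query against samples_test
ANALYZE samples_test;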
If the new query is much faster, and if you have intensive random UPD/DEL/INS activity, periodic CLUSTER could be a good idea... but it depends on actual usage patterns (SELECT/modify ratio, types of updates, and so on).
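If the test pans out, clustering the real table on the reporting index would look something like this (the index name is made up; CLUSTER rewrites the table and holds an exclusive lock while it runs, and on pre-8.3 servers the syntax is CLUSTER indexname ON tablename):

-- 8.3+ syntax: rewrite the table in the index's order
CLUSTER samples USING samples_ip_time_idx;
ANALYZE samples;

-- later maintenance runs can reuse the ordering chosen above
CLUSTER samples;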
Good idea! This vastly improves query times.
and finally, you did not specify what PostgreSQL version you are using.
In the case I've been working with, it's 8.1 =( But we have a few instances of this database... I believe the rest are a mixture of 8.4s and they all have the same problem.
--Royce