Re: Table Clustering & Time Range Queries

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I'm surprised clustering as your main optimization has scaled up for you as long as it has, I normally see that approach fall apart once you're past a few hundred GB of data. You're putting a lot of work into a technique that only is useful for smaller data sets than you have now. There are two basic approaches to optimizing queries against large archives of time-series data that do scale up when you can use them:

1) Partition the tables downward until you reach a time scale where the working set fits in RAM.

2) Create materialized views that roll up the data needed for the most common reports people need run in real-time. Optimize when those run to keep overhead reasonable (which sounds possible given your comments about regular maintenance windows). Switch the app over to running against the materialized versions of any data it's possible to do so on. The two standard intros to this topic are at http://tech.jonathangardner.net/wiki/PostgreSQL/Materialized_Views and http://www.pgcon.org/2008/schedule/events/69.en.html

From what you've said about your app, I'd expect both of these would be
worth considering.

--
* Greg Smith gsmith@xxxxxxxxxxxxx http://www.gregsmith.com Baltimore, MD

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux