Search Postgresql Archives

Re: Best way to handle multi-billion row read-only table?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Asher wrote:
Once loaded into the database the data will never be deleted or modified and will typically be accessed over a particular date range for a particular channel (e.g. "sample_time >= X AND sample_time <= Y AND channel=Z"). A typical query won't return more than a few million rows and speed is not desperately important (as long as the time is measured in minutes rather than hours).

Are there any recommended ways to organise this? Should I partition my big table into multiple smaller ones which will always fit in memory (this would result in several hundreds or thousands of sub-tables)? Are there any ways to keep the index size to a minimum? At the moment I have a few weeks of data, about 180GB, loaded into a single table and indexed on sample_time and channel and the index takes up 180GB too.

One approach to consider is partitioning by sample_time and not even including the channel number in the index. You've got tiny records; there's going to be hundreds of channels of data on each data page pulled in, right? Why not minimize physical I/O by reducing the index size, just read that whole section of time in to memory (they should be pretty closely clustered and therefore mostly sequential I/O), and then push the filtering by channel onto the CPU instead. If you've got billions of rows, you're going to end up disk bound anyway; minimizing physical I/O and random seeking around at the expense of CPU time could be a big win.

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@xxxxxxxxxxxxxxx  www.2ndQuadrant.com


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux