Search Postgresql Archives

Re: Best way to handle multi-billion row read-only table?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



John R Pierce wrote:
how do you plan on accessing this monster data? do you expect to be looking up single values or small set of values at a specific time? seems to me like this is the sort of data thats more often processed in the aggregate, like running a fourier analysis of sliding windows, and that sort of data processing may well be more efficiently done with fixed block binary files rather than relational databases, as there's no real relationships in this data.

The data will initially be accessed via a simple GUI which will allow browsing over a subset of the data (subsampled down to 1 sample/minute/hour, etc. during the data load phase and so massively smaller datasets) and then once something "interesting" has been found manually (fully automatic detection of transients has to wait until we've formally described what we mean by "transient" :-)) the start and end times can be handed over to our automatic processing code to go through the full dataset.

I did consider just sticking the data into a series of big dumb files but by putting them in a DB I can both maintain automatic links between the full and subsampled data sets and between each data point and the equipment that measured it and, possibly more importantly, I can provide a simpler interface to the other people on my project to access the data. I'm a computer scientist but I'm doing my PhD in the Civil Engineering dept and all of my colleagues are civil engineers - all quite happy using Matlab's database plugin but less happy writing traditional code to crunch through raw files. I'm aware that I'm taking a, possibly quite large, performance hit by using a database but I'm hoping that the advantages will outweigh this.

Many thanks for all the replies to my query. I'm going to go with a partitioned table design and start uploading some data. I'll post how it performs once I've got some real size data in it.


Asher.


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux