On Thu, May 28, 2009 at 2:54 PM, Ivan Voras <ivoras@xxxxxxxxxxx> wrote:
> The volume of sensor data is potentially huge, on the order of 500,000
> updates per hour. Sensor data is a few numeric(15,5) numbers.

The size of that dataset, combined with the apparent simplicity of your schema and the apparent requirement for mostly sequential access (I'm guessing about the latter two), leads me to suspect you would be happier with something other than a traditional relational database.

I don't know how exact your historical data has to be. Could you get by with something like RRDTool?

RRDTool is a round-robin database that stores multiple levels of historical values, each aggregated by a consolidation function. Within a single database you typically define an "average" archive, a "max" archive and so on, with the appropriate function to transform the data, and you subdivide these by day, month, year and so on, at whatever granularity you choose. When you store a value, the historical data is aggregated at the appropriate levels of granularity -- so the current day's archive is more precise than the monthly one, and so on -- and you always have access to the exact current data. RRDTool is used by software such as Munin and Cacti, which track a huge number of readings over time for graphing. (A minimal sketch of such a setup follows at the end of this message.)

If you require precise data with the ability to filter, aggregate and correlate over multiple dimensions, something like Hadoop -- or one of the Hadoop-based column database implementations, such as HBase or Hypertable -- might be a better option, combined with MapReduce or Pig to execute analysis jobs.

A.
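
P.S. To make the RRDTool suggestion concrete, here is a minimal sketch of a setup for a single sensor. The step size, heartbeat, archive lengths and data source name are illustrative assumptions, not recommendations tuned for your workload:

    # One GAUGE data source sampled every 300 seconds; keep 5-minute
    # averages for one day, plus hourly averages and maxima for one week
    # (all of these numbers are illustrative):
    rrdtool create sensor.rrd --step 300 \
        DS:reading:GAUGE:600:U:U \
        RRA:AVERAGE:0.5:1:288 \
        RRA:AVERAGE:0.5:12:168 \
        RRA:MAX:0.5:12:168

    # Store a value ("N" means now), then read back the aggregated
    # history for the last day:
    rrdtool update sensor.rrd N:21.53210
    rrdtool fetch sensor.rrd AVERAGE --start -1d

The point of the round-robin layout is that the file never grows: each archive overwrites its oldest rows, so 500,000 updates per hour cost you a fixed, predictable amount of disk.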