On Thu, May 28, 2009 at 2:54 PM, Ivan Voras <ivoras@xxxxxxxxxxx> wrote:
> The volume of sensor data is potentially huge, on the order of 500,000
> updates per hour. Sensor data is a few numeric(15,5) numbers.

The size of that dataset, combined with the apparent simplicity of your schema and the apparent requirement for mostly sequential access (I'm guessing about the latter two), leads me to suspect you would be happier with something other than a traditional relational database.

I don't know how exact your historical data has to be. Could you get by with something like RRDTool?

RRDTool is a round-robin database that stores multiple levels of historical values, each aggregated by a consolidation function. Within a single database you typically define an "average" archive, a "max" archive and so on, with the appropriate function to transform the data, and you subdivide these by day, month, year and so on, at whatever granularity you choose. When you store a value, the historical data is aggregated at the appropriate levels of granularity -- so the current day's archive is more precise than the monthly one, and so on -- and you always have access to the exact current data. RRDTool is used by software such as Munin and Cacti, which track a huge number of readings over time for graphing. (A minimal sketch of such a setup follows at the end of this message.)

If you require precise data with the ability to filter, aggregate and correlate over multiple dimensions, something like Hadoop -- or one of the Hadoop-based column database implementations, such as HBase or Hypertable -- might be a better option, combined with MapReduce or Pig to execute analysis jobs.

A.
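
P.S. To make the RRDTool suggestion concrete, here is a minimal sketch of a setup for a single sensor. The step size, heartbeat, archive lengths and data source name are illustrative assumptions, not recommendations tuned for your workload:

    # One GAUGE data source sampled every 300 seconds; keep 5-minute
    # averages for one day, plus hourly averages and maxima for one week
    # (all of these numbers are illustrative):
    rrdtool create sensor.rrd --step 300 \
        DS:reading:GAUGE:600:U:U \
        RRA:AVERAGE:0.5:1:288 \
        RRA:AVERAGE:0.5:12:168 \
        RRA:MAX:0.5:12:168

    # Store a value ("N" means now), then read back the aggregated
    # history for the last day:
    rrdtool update sensor.rrd N:21.53210
    rrdtool fetch sensor.rrd AVERAGE --start -1d

The point of the round-robin layout is that the file never grows: each archive overwrites its oldest rows, so 500,000 updates per hour cost you a fixed, predictable amount of disk.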