On Fri, 21 Nov 2008, Sam Mason wrote:
It's not quite what you're asking for; but have you checked out any of the databases that have resulted from the StreamSQL research?
A streaming database approach is in fact ideally suited to handling this particular problem. Looking at the original request here:
* for a given client (and sensor), and time interval, I need the min, max, and avg of the values; * for a given time interval (and sensor), I need min, max, and avg of the values;
The most efficient way possible to compute these queries is to buffer the full interval worth of data in memory as the values are being inserted, compute these aggregates once the time window for the interval has ended, then write a summarized version of the data. Doing that sort of thing and then operating on the aggregated data, rather than maintaining a bulky index covering every single row, is exactly the sort of thing a good streaming database would handle for you.
I can't comment on the current state of Borealis. But as the original focus of the streaming database research that spawned Truviso where I work was accelerating data capture from sensor networks, I know this general style of approach is quite beneficial here. For example, http://telegraph.cs.berkeley.edu/telegraphcq/v2.1/ is an open-source implementation from that research that's integrated with an older version of PostgreSQL. If you look at the "windowed aggregates" example there, it shows what a streaming query similar to the requirements here would look like: an average and other statistics produced on a per-interval basis.
-- * Greg Smith gsmith@xxxxxxxxxxxxx http://www.gregsmith.com Baltimore, MD -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general