On Thu, 20 Aug 2009, Sanjay Arora wrote:
What exactly the difference in layman sys admin terms between our
everyday postgres and this time series data.
There is a good introduction to where regular databases fail to solve
time-series data problems at http://cs.nyu.edu/shasha/papers/jagtalk.html
As mentioned there, one of the very hard problems to solve in a general
way is how to deal with all the gaps in your data when the market isn't
open, which really complicates how you compute indicators. For example,
in regular SQL you might compute an average over some period using
something like:
select sum(x) / count(*) where ts>=t1 and ts<=t2
You might think that you could pick t1 and t2 here based on the number of
samples you want to average; let's say you want an average over the last
minute of data, so you try this:
t1=<something>
t2=<something> + interval('1 minute')
This completely falls apart when when the data isn't continuous. If the
market was closed for some period between t1 and t2, you need to use a
rolling window over the data you've got instead. As of PostgreSQL 8.4's
support for SQL window fuctions, it's easier to compute this sort of
thing, but there's still plenty of rough spots to get nailed by.
--
* Greg Smith gsmith@xxxxxxxxxxxxx http://www.gregsmith.com Baltimore, MD
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general