Search Postgresql Archives

Re: Design Question (Time Series Data)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ted Byers wrote:

If you really have such a disparity among your series,
then it is a mistake to blend them into a single
table.  You really need to spend more time analyzing
what the data means.  If one data set is comprised of
the daily close price of a suite of stocks or mutual
funds, then it makes sense to include all such series
in a given table, but if some of the series are daily
close price and others are monthly averages, then it
is a mistake to combine them in a single table, and
two or more would be warranted.  Or if the data are
from different data feed vendors, then you have to
think very carefully whether or not the data can
logically be combined.

> I don't much like any of the above.  When I have had
> to process data for financial consultants, I applied a
> few simple filters to ensure the data is clean (e.g.
> tests to ensure data hasn't been corrupted during
> transmission, proper handling of missing data, &c.),
> and then bulk loaded the data into a suite of tables
> designed specifically to match the vendor's
> definitions of what the data means.  Only then did we
> apply specific analyses designed in consultation with
> the financial consultant's specialists; folk best
> qualified to help us understand how best to understand
> the data and especially how it can be combined in a
> meaningful way.

Thank you for your long and detailed answer. It would be possible for me to group those series by frequency or vendor (or even by size), but this would make sql-queries really complicated. Grouping the series by theme would probably work but be a real pain and i doubt whether this has a real advantage for my application.

The real analysis of the data will happen on the client-side, the server has only to ensure that the time series are up to date and not corrupted - it doesn't care about the meaning of the value of any given field (I was probably a little bit unclear about that). It will (with the help of R - via PL/R) do a few simple calculations and transformations, but those can be applied without much knowledge about the properties of the data.

I do not know if PostgreSQL, or any other RDBMS,
includes the ability to call on software such as "R"
to do specific statistical analysis, but if I had to
do some time series analysis, I would do it in a
client application that retrieves the appropriate data
from the database and either does the analysis in
custom code I have written (usually in C++, as some of
my favourite analyses have not made it into commonly
available open source or commercial statistical
software) or invokes the appropriate functions from
statistical software I have at my disposal.  The
strategy I describe above makes the SQL required for
much of this dirt simple.

Exactly, i'm looking for a simple and clean way to store data(possibly transform its frequency) and serve it to the clients (in different formats).

A typical request would be "Give me quarterly GDP and quarterly Import figures for the US between 1987-01 to 2007-2 plus the effective federal funds rate (transformed to quarterly averages) for the same period".

The server therefore has to grab the values for the first two series (both are much longer - go back to the 1930's or 40's), extract the values for the requested period (pad it if it's too short). It will do the same for the third series and transform it to quarterly values (only available for months/weeks). Finally it will return a table with a date-column and 3 columns for the requested data.

Most of the stuff will be done in stored procedures, since i don't want to mess around with it in php (which acts as as proxy for the clients).

Andreas

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

              http://archives.postgresql.org/

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux