Re: Stock Market Price Data & postgreSQL? HELLPPP Please

Tom Lane <tgl@xxxxxxxxxxxxx> · Wed, 19 Aug 2009 18:04:21 -0400

Sanjay Arora <sanjay.k.arora@xxxxxxxxx> writes:
> I want to store stock market price data in postgreSQL and some of my
> associates are saying this is not workable, not only in postgreSQL but
> in any database. Arguments given are listed below. Kindly comment and
> if possible suggest solutions and pointers to further reading.

Actually, the way I came to use Postgres was from a company that was
working on technical analysis of market data.  I would recommend looking
at a hybrid approach.  Storing raw tick data in a DBMS at one row per
tick is indeed not very bright.  You could possibly make it work if
you throw beefy hardware at the problem, but there's not much point
because you're really not playing to a DBMS's strengths when you do
that.  Once it's arrived the data is static (barring very-infrequent
corrections, which in practice you might never do at all anyway).
And the access patterns for it (at least if you're doing the same type
of technical analysis we were doing) are extremely predictable and
linear.  So you aren't doing anything wherein SQL will shine.  On the
other hand, there are definitely components of the big picture where
SQL *can* shine.  I'd strongly recommend tracking your live trading
positions in an RDBMS, for example.  It might be worth managing your
tick data using DBMS entries that represent, say, monthly tick data
files.  Activities like removing old data would get a lot easier that
way.  (I wish this had occurred to me twelve years ago, but it's water
over the dam now.)  Any performance issues can be avoided by having
the technical analysis processes read the static tick data files
directly.  This is generally pretty similar to the frequently-asked
question "should I store a lot of big image or document files directly
in the database, or just use it as an index of external files?".
If you dig around in the PG list archives you'll find more than you want
to read about that on both sides.  If you've aggregated the ticks in
suitable volumes you can solve it either way, really, but an external
file is a bit easier to append to during raw data collection.

Don't fall into the trap of assuming that all your requirements must be
solved by a single tool.  You'll spend all your time building the
perfect tool, and go broke before you finish it.  Use a DBMS for the
parts of the problem it's good for, and go outside it for what it isn't.

BTW, we were doing full-speed tick data collection and real-time trading
analysis in the nineties, on machines that my current cell phone would
leave in the dust.  The market data volume has grown a lot since then
of course, but the price of hardware has fallen a lot more.  I'd not
recommend designing your system on the assumption that raw per-tick
speed is the be-all and end-all.

			regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general