Re: Need help with complicated SQL statement

Shane Ambler <pgsql@xxxxxxxxxx> · Tue, 20 Nov 2007 18:04:28 +1030

Ted Byers wrote:
Thanks Shane,

It works reasonably well.  It gets the right answer,
but I guess my data set is much larger than your test.

What indexes have you got?

Using this index on the sample I sent gets the response time to about 
5ms (per stock_id) (as opposed to 900ms with these columns indexed 
separately)

CREATE INDEX idx_stockprices_id_date ON stockprices (stock_id,price_date);

Please consider the appended data.

The first two SQL statements are directly comparable. 
My Left join is marginally simpler, as shown by
EXPLAIN, and runs to completion in about 3 seconds
(elapsed real time), as opposed to about 30 seconds
for the two views.  It makes a little sense, though,
in that according to EXPLAIN, the LEFT JOIN needs to
consider a dramatically smaller number of rows.  What
I find puzzling, though, is that it takes less time to
get the report for 28 stocks at a given time than it
does to get the report for 1. (Both take about 30
seconds, but for 28 stocks, it takes about 0.005
seconds less time ;-)

This is a case where LEFT JOINS appear to be much
faster than subqueries.

I appreciate all your help, but I am struggling to
figure out how best to adapt my LEFT JOINs in your
VIEWs, so that the latter benefit from the speed of
the JOINs.  The heart of my problem is to figure out
how to  use a stock_id in the WHERE clause.

That is where I have moved away from your select - the way you are 
joining makes it hard to adapt to where you want it to end up (more than 
one stock_id per query)

By using the view that generates the rows you want you make the query 
sent from the client so much simpler and make it easy to get any single 
or list of stock_id you want.

The second view will add little overhead and can be part of the select 
sent from the client if you wish. I separated them in to two views to 
prevent duplicating the same selects for the calculations. You can merge 
them into one view if you wish - the first view would become a subselect 
for the second view.

One thing I am not certain of is, "Is there a way to
preserve the logic of the WHERE clauses by replacing
the WHERE clause, which I use to sample the time
series at 22 days ago, 66 days ago, 132 days ago &c.,
by a "GROUP BY" clause, grouping by stock_id?  If so,
might that, along with an additional LEFT JOIN, get me
the result I am after?

I created a stored procedure that takes an id argument
(and can usefully invoke it on any stock_id in the
database), but the problem remains as to how to
construct a record set by applying the procedure to
each id in a set of ids returned, e.g., by SELECT
stock_id FROM stocks;

Ted

--

Shane Ambler
pgSQL@xxxxxxxxxx

Get Sheeky @ http://Sheeky.Biz

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
      subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your
      message can get through to the mailing list cleanly