I am using postgresql 8.3-beta3. I have a table 'test' with three fields:
sid data date
1 1.1 2009-09-01 1:00:00
1 2.1 2010-01-01 1:00:20
2 3.1 2009-09-01 1:00:10
2 0.1 2010-01-01 1:00:30
I create index for data field.
Each sid may have millions of rows.
I want to get maximum data value and corresponding "time" for each group of sid. Here is my query:
########################
select t1.sid , max(t1.data)
(select t2.date
from test t2,
where t2.sid = t1.sid and
t2.date between '2009-08-01' and '2010-01-02' and
order by t2.data DESC limit 1
)
from test t1
where t1.date between '2009-08-01' and '2010-01-08' and
group by t1.sid
##########################
But when max() in postgresql may slow down the search when there are millions of rows for each sid.
So I use " order by t2.data DESC limit 1" to find max:
########################
select t1.sid ,
(select t2.data
from test t2,
where t2.sid = t1.sid and
t2.date between '2009-08-01' and '2010-01-02' and
order by t2.data DESC limit 1
) (select t2.date
from test t2,
where t2.sid = t1.sid and
t2.date between '2009-08-01' and '2010-01-02' and
order by t2.data DESC limit 1
)
from test t1
where t1.date between '2009-08-01' and '2010-01-08' and
group by t1.sid
##########################
The second query looks "strange" since similar search is done twice.
Because of two fields, the following can not be used directly in the above query.
(select t2.date, t2.data
from test t2,
where t2.sid = t1.sid and
t2.date between '2009-08-01' and '2010-01-02' and
order by t2.data DESC limit 1
)
Any suggestions for the best way to get maximum data value and corresponding "time" for each group of sid in my case?
Thanks a lot.
ouyang