On Aug 16, 2006, at 3:51 PM, Tom Lane wrote:
/* Select all sheep who's most recent transfer was into the
subject flock */
SELECT DISTINCT f1.regn_no, f1.transfer_date as date_in
FROM SHEEP_FLOCK f1 JOIN
/* The last transfer date for each sheep */
(SELECT f.regn_no, MAX(f.transfer_date) as last_xfer_date
FROM SHEEP_FLOCK f
GROUP BY f.regn_no) f2
ON f1.regn_no = f2.regn_no
WHERE f1.flock_no = '1359'
AND f1.transfer_date = f2.last_xfer_date
This seems pretty closely related to this recent thread:
http://archives.postgresql.org/pgsql-performance/2006-08/msg00220.php
in which the OP is doing a very similar kind of query in almost
exactly
the same way.
I can't help thinking that there's probably a better way to phrase
this
type of query in SQL, though it's not jumping out at me what that is.
I don't know about better, but I tend to phrase these in a quite
different way that's (hopefully) equivalent:
select latest.regn_no,
latest.transfer_date as date_in
from sheep_flock latest
where not exists (
select 'x'
from sheep_flock even_later
where latest.regn_no = even_later.regn_no
and latest.transfer_date < even_later.transfer_date)
and latest.flock_no = '1359'
There's no MAX() or DISTINCT here, so maybe this is easier to optimize?
--
Scott Lamb <http://www.slamb.org/>