On Fri, Apr 5, 2013 at 11:40 AM, Merlin Moncure <mmoncure@xxxxxxxxx> wrote: > > On Fri, Apr 5, 2013 at 11:54 AM, Dave Johansen <davejohansen@xxxxxxxxx> wrote: > > On Sat, Feb 26, 2011 at 2:38 PM, Dave Johansen <davejohansen@xxxxxxxxx> > > wrote: > >> > >> Unfortunately, I'm running 8.3.3 and to my knowledge the windowing stuff > >> wasn't added til 8.4. > >> Dave > >> > >> On Feb 26, 2011 2:06 PM, "Josh Berkus" <josh@xxxxxxxxxxxx> wrote: > >> > Dave, > >> > > >> > Why not test the windowing version I posted? > > > > We finally have moved over to 8.4 and so I just wanted to post the > > time comparison numbers to show the times on 8.4 as well. This is also > > a newer data set with ~700k rows and ~4k distinct id_key values. > > > > 1) Dependent subquery > > SELECT a.id_key, a.time_stamp, a.value FROM data AS a WHERE > > a.time_stamp = (SELECT MAX(time_stamp) FROM data AS b WHERE a.id_key = > > b.id_key); > > 8.3.3: Killed it after a few minutes > > 8.4.13: Killed it after a few minutes > > > > 2) Join against temporary table > > SELECT a.id_key, a.time_stamp, a.value FROM data AS a JOIN (SELECT > > id_key, MAX(time_stamp) AS max_time_stamp FROM data GROUP BY id_key) > > AS b WHERE a.id_key = b.id_key AND a.time_stamp = b.max_time_stamp; > > 8.3.3: 1.4 s > > 8.4.13: 0.5 s > > > > 3) DISTINCT ON: > > SELECT DISTINCT ON (id_key) id_key, time_stamp, value FROM data ORDER > > BY id_key, time_stamp DESC; > > Without Index: > > 8.3.3: 34.1 s > > 8.4.13: 98.7 s > > With Index (data(id_key, time_stamp DESC)): > > 8.3.3: 3.4 s > > 8.4.13: 1.3 s > > > > 4) Auto-populated table > > SELECT id_key, time_stamp, value FROM data WHERE rid IN (SELECT rid > > FROM latestdata); > > 8.3.3: 0.2 s > > 8.4.13: 0.06 s > > > > 5) Windowing > > SELECT id_key, time_stamp, value FROM (SELECT id_key, time_stamp, > > value, row_number() OVER (PARTITION BY id_key ORDER BY time_stamp > > DESC) AS ranking FROM data) AS a WHERE ranking=1; > > 8.3.3: N/A > > 8.4.13: 1.6 s > > I would also test: > > *) EXISTS() > > SELECT a.id_key, a.time_stamp, a.value FROM data > WHERE NOT EXISTS > ( > SELECT 1 FROM data b > WHERE > a.id_key = b.id_key > and b.time_stamp > a.time_stamp > ); I tried this and it was slow: 8.3.3: 674.4 s 8.4.13: 40.4 s > > *) custom aggregate (this will not be the fastest option but is a good > technique to know -- it can be a real life saver when selection > criteria is complex) > > CREATE FUNCTION agg_latest_data(data, data) returns data AS > $$ > SELECT CASE WHEN $1 > $2 THEN $1 ELSE $2 END; > $$ LANGUAGE SQL IMMUTABLE; > > CREATE AGGREGATE latest_data ( > SFUNC=agg_latest_data, > STYPE=data > ); > > SELECT latest_data(d) FROM data d group by d.id_key; > > the above returns the composite, not the fields, but that can be worked around. My real table actually returns/needs all the values from the row so I didn't feel like messing with aggregate stuff. Thanks, Dave -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance