On Mon, Apr 7, 2008 at 9:11 PM, Stephen Denne <Stephen.Denne@xxxxxxxxxxxxxx> wrote:

> You may be able to make use of an index by rearranging your query to
> generate a series between your min & max values, testing whether each
> value is in your table.
>
> You've got 4252 distinct values, but what is the range of max - min?
> Say it's 5000 values, you'd do 5000 lookups via an index, unless
> postgres thought that the number of index-based lookups were going to
> be more expensive than reading the entire table.

Upon further investigation, the above works very well:

explain analyze
select ds from (
    select generate_series(
        (select datestamp from vals order by datestamp asc limit 1),
        (select datestamp from vals order by datestamp desc limit 1),
        86400) as ds
) series
where exists (select datestamp from vals where datestamp = ds);

                                   QUERY PLAN
----------------------------------------------------------------------------------------------------
 Subquery Scan series  (cost=4.89..8.90 rows=1 width=4) (actual time=0.080..25264.239 rows=4252 loops=1)
   Filter: (subplan)
   ->  Result  (cost=4.89..4.90 rows=1 width=0) (actual time=0.051..7.491 rows=6163 loops=1)
         InitPlan
           ->  Limit  (cost=0.00..2.45 rows=1 width=4) (actual time=0.023..0.024 rows=1 loops=1)
                 ->  Index Scan using val_datestamp_idx on vals  (cost=0.00..184401441.14 rows=75391472 width=4) (actual time=0.021..0.021 rows=1 loops=1)
           ->  Limit  (cost=0.00..2.45 rows=1 width=4) (actual time=0.020..0.020 rows=1 loops=1)
                 ->  Index Scan Backward using val_datestamp_idx on validations  (cost=0.00..184401441.14 rows=75391472 width=4) (actual time=0.018..0.018 rows=1 loops=1)
   SubPlan
     ->  Index Scan using val_datestamp_idx on validations  (cost=0.00..70453.21 rows=17685 width=4) (actual time=4.096..4.096 rows=1 loops=6163)
           Index Cond: (datestamp = $0)
 Total runtime: 25267.033 ms
(12 rows)

The series generates all the possible datestamps plus about 40% extra.
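For anyone wanting to experiment with the same generate-a-series-and-probe-with-EXISTS technique without a 75M-row PostgreSQL table handy, here is a minimal self-contained sketch using Python's sqlite3 (SQLite has no generate_series() built in, so a recursive CTE stands in for it; the table name, column name, and day values are made up for the illustration):

```python
import sqlite3

# Toy stand-in for the vals table: many duplicate rows, few distinct
# day-aligned datestamps, with gaps (days 2, 5, 6 are missing).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vals (datestamp INTEGER)")
conn.execute("CREATE INDEX val_datestamp_idx ON vals (datestamp)")

DAY = 86400
rows = [(d * DAY,) for d in (0, 1, 3, 4, 7) for _ in range(100)]
conn.executemany("INSERT INTO vals VALUES (?)", rows)

# Generate every candidate datestamp between min and max in DAY steps,
# then keep only those that actually exist in the table -- one indexed
# probe per candidate instead of a scan over every row.
distinct_days = conn.execute("""
    WITH RECURSIVE series(ds) AS (
        SELECT (SELECT MIN(datestamp) FROM vals)
        UNION ALL
        SELECT ds + ? FROM series
        WHERE ds < (SELECT MAX(datestamp) FROM vals)
    )
    SELECT ds FROM series
    WHERE EXISTS (SELECT 1 FROM vals WHERE datestamp = ds)
""", (DAY,)).fetchall()

print([ds // DAY for (ds,) in distinct_days])  # -> [0, 1, 3, 4, 7]
```

The recursive CTE emits eight candidates (days 0 through 7), and the EXISTS probe filters that down to the five distinct values actually present, mirroring the 6163-candidates-to-4252-rows filtering in the plan above.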
What's particularly interesting to me is that this at least seems to validate my original assumption: if the planner could be coaxed into using the index, the query would be faster - or am I missing something? At 25 seconds, this query was certainly much faster than even the GROUP BY version, which ran in 120 seconds.

As before, thanks for all of the information and ideas. Going from 722 seconds down to 25 seconds is a hefty improvement.

-- 
- David T. Wilson
Princeton Satellite Systems
david.t.wilson@xxxxxxxxx