Yes, that is what happened, making the tests rather meaningless, and giving me the false impression that the indexes were being used. They were but only because of cached results. When multiple users making different queries, the performance will return to ~80s per query.
I also tried Kevin's suggestion, which had no noticeable effect:
effective_cache_size = 512MB
That said, when using the following condition, the query is fast (1 second):
extract(YEAR FROM sc.taken_start) >= 1963 AND
extract(YEAR FROM sc.taken_end) <= 2009 AND
" -> Index Scan using measurement_013_stc_idx on measurement_013 m (cost=0.00..511.00 rows=511 width=15) (actual time=0.018..3.601 rows=3356 loops=104)"
" Index Cond: ((m.station_id = sc.station_id) AND (m.taken >= sc.taken_start) AND (m.taken <= sc.taken_end) AND (m.category_id = 7))"
This condition makes it slow (13 seconds on first run, 8 seconds thereafter):
extract(YEAR FROM sc.taken_start) >= 1900 AND
extract(YEAR FROM sc.taken_end) <= 2009 AND
" Filter: (category_id = 7)"
" -> Seq Scan on measurement_013 m (cost=0.00..359704.80 rows=18118464 width=15) (actual time=0.008..4025.692 rows=18118395 loops=1)"
At this point, I'm tempted to write a stored procedure that iterates over each station category for all the years of each station. My guess is that the planner's estimate for the number of rows that will be returned by extract(YEAR FROM sc.taken_start) >= 1900 is incorrect and so it chooses a full table scan for all rows. Even though the lower bound appears to be a constant value of the 1900, the average year a station started collecting data was 44 years ago (1965), and did so for an average of 21.4 years.
The part I am having trouble with is convincing PG to use the index for the station ID and the date range for when the station was active. Each station has a unique ID; the data in the measurement table is ordered by measurement date then by station.
Should I add a clustered index by station then by date?
Any other suggestions are very much appreciated.
Dave