I was just about to reply to the thread, thanks for asking. Clustering was key. After rebooting the machine (just to make sure absolutely nothing was cached), I immediately ran a report on Toronto: 5.25 seconds!
Here's what I did (a rough SQL sketch follows the list):
- Created a new set of tables that matched the old set, with a statistics target of 1000 on the station and taken (date) columns.
- Inserted the data from the old hierarchy into the new set, ordered by station id then by date (same seven child tables as before: one per category), to ensure a strong correlation between primary key and station id.
- Added three indexes per table: (a) station id; (b) date taken; and (c) station-taken-category.
- Set the station-taken-category index as the CLUSTER index on each table.
- Vacuumed the new tables.
- Dropped the old tables.
- Set the following configuration values:
- shared_buffers = 1GB
- temp_buffers = 32MB
- work_mem = 32MB
- maintenance_work_mem = 64MB
- seq_page_cost = 1.0
- random_page_cost = 2.0
- cpu_index_tuple_cost = 0.001
- effective_cache_size = 512MB
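For anyone who wants to reproduce this, here is roughly what the rebuild looks like in SQL for one of the tables. The names are made up (obs_temperature stands in for one of the seven per-category child tables, old_obs_temperature for its old counterpart, and amount for the measurement column); only the station/taken/category columns come from the actual schema:

    -- One of the seven per-category child tables (hypothetical name).
    CREATE TABLE obs_temperature (
        id         SERIAL PRIMARY KEY,
        station_id INTEGER NOT NULL,
        taken      DATE    NOT NULL,
        category   CHAR(1) NOT NULL,
        amount     NUMERIC
    );

    -- Raise the statistics target on the station and taken columns.
    ALTER TABLE obs_temperature ALTER COLUMN station_id SET STATISTICS 1000;
    ALTER TABLE obs_temperature ALTER COLUMN taken      SET STATISTICS 1000;

    -- Load from the old table, ordered so the new primary key correlates
    -- strongly with station id.
    INSERT INTO obs_temperature (station_id, taken, category, amount)
        SELECT station_id, taken, category, amount
        FROM old_obs_temperature
        ORDER BY station_id, taken;

    -- The three indexes.
    CREATE INDEX obs_temperature_station_idx ON obs_temperature (station_id);
    CREATE INDEX obs_temperature_taken_idx   ON obs_temperature (taken);
    CREATE INDEX obs_temperature_stc_idx
        ON obs_temperature (station_id, taken, category);

    -- Mark the composite index for CLUSTER (the rows were loaded in that
    -- order, so a physical CLUSTER pass is not strictly needed), then
    -- vacuum/analyze so the planner sees the correlation and the new
    -- statistics target.
    ALTER TABLE obs_temperature CLUSTER ON obs_temperature_stc_idx;
    VACUUM ANALYZE obs_temperature;

The configuration values above go in postgresql.conf; shared_buffers only takes effect after a restart, the rest after a reload.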
Report times for other stations:
- Vancouver: 4.2s
- Yellowknife: 1.7s
- Montreal: 6.5s
- Trois-Rivières: 2.8s
Physically ordering the data by station id is what gets the planner to choose the indexes every time; with the rows laid out that way, the index scans are cheap enough to win.
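If anyone wants to check that on their own data, EXPLAIN ANALYZE on a report-style query makes it obvious. The query below is only an illustration against the hypothetical obs_temperature table from the sketch above (the real report query is more involved, and 5182 is a made-up station id); on the clustered tables the plan should show index or bitmap scans on the composite index rather than sequential scans:

    -- Illustration only: monthly averages for one station over a date range.
    EXPLAIN ANALYZE
    SELECT date_trunc('month', taken) AS month, avg(amount)
    FROM obs_temperature
    WHERE station_id = 5182
      AND taken BETWEEN '1950-01-01' AND '2000-12-31'
    GROUP BY 1
    ORDER BY 1;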
Next week's hardware upgrade should halve those times -- unless anyone has further suggestions to squeeze more performance out of PG. ;-)
Dave