Re: Optimize date query for large child tables: GiST or GIN?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

The problem is now solved (in theory).

Well, it's not the functions per se that's the problem, it's the lack of
a useful index on the _expression_.

The measurement table indexes (on date and weather station) were not being used because the only given date ranges (e.g., 1900 - 2009) were causing the planner to do a full table scan, which is correct. What I had to do was find a way to reduce the dates so that the planner would actually use the index, rather than doing a full table scan on 43 million records. By passing in 1955 - 1960 the full table scan went away in favour of an index scan, as expected.

Each weather station has a known lifespan (per climate category). That is, not all weather stations between 1880 and 2009 collected data.  For example, one weather station monitored the maximum daily temperature between 2006-11-29 and 2009-12-31. Some stations span more than 30 years, but I believe those are in the minority (e.g., 1896-12-01 to 1959-01-31). (I'll be able to verify once the analysis is finished.)

I will add another table that maps the stations to category and min/max dates. I can then use this reference table which should (theory part here) tell the planner to use the index.

What is really impressive, though... If my understanding is correct...

PostgreSQL scanned 43 million rows 78 times, returning results in ~90 sec.

Thanks again for all your help, everybody. I sincerely appreciate your patience, comments, and ideas.

Dave


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux