Hi, I'm seeing a slow-running query. After some index tuning, the query plan looks correct; it's just slow. Running the query a second time is, unsurprisingly, very fast, thanks to OS caching or shared_buffers caching. With different parameters, however, the query is slow until it has run once. In our usage pattern, this query must be fast the first time it runs.

A couple of general stats: this is a Linode machine with a single 3 GB database and 4 GB of RAM, shared_buffers = 1024MB, effective_cache_size = 2048MB, running PostgreSQL 9.1. The machine is otherwise dormant when this query runs.

Here is the schema:

                                Table "public.premiseaccount"
       Column       |           Type           |                      Modifiers
 -------------------+--------------------------+------------------------------------------------------------
  id                | integer                  | not null default nextval('premiseaccount_id_seq'::regclass)
  created           | timestamp with time zone | not null default '2013-09-25 07:00:00+00'::timestamp with time zone
  modified          | timestamp with time zone | not null default '2013-09-25 07:00:00+00'::timestamp with time zone
  account_id        | integer                  | not null
  premise_id        | integer                  | not null
  bucket            | character varying(255)   |
 Indexes:
     "premiseaccount_pkey" PRIMARY KEY, btree (id)
     "premiseaccount_account_id" btree (account_id)
     "premiseaccount_bucket" btree (bucket)
     "premiseaccount_bucket_58c70392619aa36f" btree (bucket, id)
     "premiseaccount_bucket_like" btree (bucket varchar_pattern_ops)
     "premiseaccount_premise_id" btree (premise_id)
 Foreign-key constraints:
     "account_id_refs_id_529631edfff28022" FOREIGN KEY (account_id) REFERENCES utilityaccount(id) DEFERRABLE INITIALLY DEFERRED
     "premise_id_refs_id_5ecea2842007328b" FOREIGN KEY (premise_id) REFERENCES premise(id) DEFERRABLE INITIALLY DEFERRED

                                Table "public.electricusage"
       Column        |           Type           |                      Modifiers
 --------------------+--------------------------+-----------------------------------------------------------
  id                 | integer                  | not null default nextval('electricusage_id_seq'::regclass)
  created            | timestamp with time zone | not null default '2013-09-25 07:00:00+00'::timestamp with time zone
  modified           | timestamp with time zone | not null default '2013-09-25 07:00:00+00'::timestamp with time zone
  from_date          | timestamp with time zone | not null
  to_date            | timestamp with time zone | not null
  usage              | numeric(9,2)             |
  demand             | numeric(9,2)             |
  bill_amount        | numeric(9,2)             |
  premise_account_id | integer                  | not null
 Indexes:
     "electricusage_pkey" PRIMARY KEY, btree (id)
     "electricusage_premise_account_id" btree (premise_account_id)
     "electricusage_covered_id_from_date_usage" btree (premise_account_id, from_date, usage)
 Foreign-key constraints:
     "premise_account_id_refs_id_4c39e54406369128" FOREIGN KEY (premise_account_id) REFERENCES premiseaccount(id) DEFERRABLE INITIALLY DEFERRED

For reference, premiseaccount has about 1 million rows, one per account, grouped in buckets with an average of 5000 accounts per bucket. electricusage has 10 million rows, about 10 per premise account.
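To quantify how much data a cold run has to pull from disk, it may help to compare the on-disk size of these relations against RAM. A minimal sketch using standard catalog functions (available in 9.1), with the table names from the schema above:

```sql
-- Approximate on-disk footprint of the two tables the query touches,
-- heap and indexes separately.
SELECT relname,
       pg_size_pretty(pg_relation_size(oid)) AS heap_size,
       pg_size_pretty(pg_indexes_size(oid))  AS index_size
FROM pg_class
WHERE relname IN ('premiseaccount', 'electricusage');
```

If the working set for one bucket (scattered heap pages plus index pages) is a meaningful fraction of 4 GB, a cold first run is bound to be I/O-dominated.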
Here is the query:

explain analyze
SELECT premiseaccount.id, SUM(electricusage.usage) AS total_kwh
FROM premiseaccount
LEFT OUTER JOIN electricusage
    ON premiseaccount.id = electricusage.premise_account_id
WHERE premiseaccount.bucket = 'XXX'
    AND electricusage.from_date >= '2012-11-20 00:00:00'
GROUP BY premiseaccount.id
HAVING SUM(electricusage.usage) BETWEEN 3284 AND 3769
LIMIT 50;

 Limit  (cost=0.00..1987.24 rows=50 width=8) (actual time=931.524..203631.435 rows=50 loops=1)
   ->  GroupAggregate  (cost=0.00..179487.78 rows=4516 width=8) (actual time=931.519..203631.275 rows=50 loops=1)
         Filter: ((sum(electricusage.usage) >= 3284::numeric) AND (sum(electricusage.usage) <= 3769::numeric))
         ->  Nested Loop  (cost=0.00..179056.87 rows=36317 width=8) (actual time=101.934..203450.761 rows=30711 loops=1)
               ->  Index Scan using premiseaccount_bucket_58c70392619aa36f on premiseaccount  (cost=0.00..14882.30 rows=4516 width=4) (actual time=77.620..7199.527 rows=3437 loops=1)
                     Index Cond: ((bucket)::text = 'XXX'::text)
               ->  Index Scan using electricusage_premise_account_id_36bc8999ced10059 on electricusage  (cost=0.00..36.24 rows=9 width=8) (actual time=8.607..57.055 rows=9 loops=3437)
                     Index Cond: ((premise_account_id = premiseaccount.id) AND (from_date >= '2012-11-20 00:00:00+00'::timestamp with time zone))
 Total runtime: 203631.666 ms

(See it online at: http://explain.depesz.com/s/zeC)

What am I missing here? It seems like a relatively straightforward foreign-key join, using the correct indexes, that is just slow. Warming the OS cache seems like the only way to make this perform reasonably, but it feels like that's masking the underlying problem. I could probably do some denormalization, like putting the bucket field on the electricusage table, but that seems like a shoddy solution. Before running my query, I clear the OS and Postgres caches. I know that in theory some subset of the data will be in the OS cache and the Postgres cache.
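One way to confirm that the time is going to cold-cache disk reads rather than anything in the plan itself: EXPLAIN's BUFFERS option (available since 9.0) reports shared-buffer hits versus reads per plan node. Same query as above, just a different EXPLAIN form:

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT premiseaccount.id, SUM(electricusage.usage) AS total_kwh
FROM premiseaccount
LEFT OUTER JOIN electricusage
    ON premiseaccount.id = electricusage.premise_account_id
WHERE premiseaccount.bucket = 'XXX'
    AND electricusage.from_date >= '2012-11-20 00:00:00'
GROUP BY premiseaccount.id
HAVING SUM(electricusage.usage) BETWEEN 3284 AND 3769
LIMIT 50;
-- A large "Buffers: shared read=N" on the inner index scan would show that
-- each of the ~3437 nested-loop iterations is paying for random disk I/O.
```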
But in the real world, what we see is that a number of people get timeouts waiting for this query to finish, because the data is not in the cache. E.g., running the query with bucket = 'xxx' is fast the second time, but bucket = 'yyy' is still slow. Here are things I've tried:

+ Clustering premiseaccount on premiseaccount_bucket_58c70392619aa36f and electricusage on electricusage_covered_id_from_date_usage. This provided a modest improvement.
+ Fitting the whole DB in memory. Since this query is fast when the data is cached, I tried to fit everything into the cache: bumping RAM to 8 GB and warming the OS cache by tarring the Postgres data directory. On the 4 GB machine I still run into slow performance, but at 8 GB the entire DB fits in the OS cache. This made a huge improvement, but will only work for a while as the database grows.
+ Trying other Linode servers: same results.
+ Removing the from_date part of the query. It uses a different index but has the same performance characteristics.
+ Looking at kern, sys, and postgres logs. Nothing interesting.

I would really appreciate any help I could get! Thanks!

Bryce
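For completeness, the denormalization I called shoddy above could be sketched like this (index and column names here are made up, and a trigger would be needed to keep the copied column in sync, which is part of why it feels shoddy). The idea is that one range scan of a single covering index on electricusage replaces ~3400 scattered per-account index probes:

```sql
-- Hypothetical sketch: copy bucket onto electricusage so the query can be
-- answered from one contiguous index range instead of one random probe
-- per premise account.
ALTER TABLE electricusage ADD COLUMN bucket varchar(255);

UPDATE electricusage eu
SET bucket = pa.bucket
FROM premiseaccount pa
WHERE pa.id = eu.premise_account_id;

-- Covering index: leading columns match the WHERE clause (bucket,
-- from_date); the trailing columns feed GROUP BY and SUM without
-- touching the heap's scattered pages.
CREATE INDEX electricusage_bucket_from_date
    ON electricusage (bucket, from_date, premise_account_id, usage);
```

Even so, I'd rather understand why the normalized query can't be made fast cold.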