PostgreSQL 8.4.8 on CentOS 5.6, x86_64. Default settings except work_mem = 1MB.

NOTE: I am using partitioned tables here, and was querying the 'master' table. Perhaps this is a known issue.

I ran a query recently whose result set was very large. The outermost part of the query plan looked like this:

HashAggregate  (cost=56886512.96..56886514.96 rows=200 width=30)
  ->  Result  (cost=0.00..50842760.97 rows=2417500797 width=30)

The row count for 'Result' is in the right ballpark, but why does HashAggregate think it can turn 2 *billion* rows of strings (an average of 30 bytes long) into only 200? This is my primary concern. If I don't disable hash aggregation, PostgreSQL quickly consumes huge quantities of memory and eventually gets killed by the OOM killer.

After manually disabling hash aggregation, I ran the same query. It has been running for over two days now. The disk is busy, but the amount of data actually transferred is very low. Total data size is approx. 250GB, perhaps a bit less. The query scans 160 or so tables for data.

If I instead use a DISTINCT + UNION over each table, the plan looks like this:

Unique  (cost=357204094.44..357318730.75 rows=22927263 width=28)
  ->  Sort  (cost=357204094.44..357261412.59 rows=22927263 width=28)

23 million rows is more like it, and the cost is much lower. What is the possibility that DISTINCT/UNIQUE operations can be pushed "down" into queries during the planning stage, to see whether they are less expensive? In this case, PostgreSQL believes (probably correctly; I'll let you know) that distinct(column foo from tableA + column foo from tableB + column foo from tableC ...) is more expensive than distinct(distinct column foo from tableA + distinct column foo from tableB ...). A sketch of the two forms is in the P.S. below.

--
Jon
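P.S. For concreteness, a rough sketch of the two query forms, using the placeholder names from above (foo, tableA, tableB, ...) rather than my real schema, and with the partition list cut down to three of the ~160 tables:

-- Form 1: DISTINCT over the partitioned parent ('master') table.
-- The planner estimates only 200 distinct values out of ~2.4 billion
-- rows, so it picks a HashAggregate that exhausts memory unless hash
-- aggregation is disabled for the session (SET enable_hashagg = off;).
SELECT DISTINCT foo
FROM master_table;

-- Form 2: deduplicate each partition separately, then let UNION
-- (which removes duplicates itself) merge the much smaller per-table
-- result sets.
SELECT DISTINCT foo FROM tableA
UNION
SELECT DISTINCT foo FROM tableB
UNION
SELECT DISTINCT foo FROM tableC;
-- ... and so on for the remaining partitions.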