On Mon, Jun 20, 2011 at 3:31 PM, Jon Nelson <jnelson+pgsql@xxxxxxxxxxx> wrote:
> On Mon, Jun 20, 2011 at 11:08 AM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
>> Jon Nelson <jnelson+pgsql@xxxxxxxxxxx> writes:
>>> I ran a query recently where the result was very large. The outermost
>>> part of the query looked like this:
>>
>>>   HashAggregate  (cost=56886512.96..56886514.96 rows=200 width=30)
>>>     ->  Result  (cost=0.00..50842760.97 rows=2417500797 width=30)
>>
>>> The row count for 'Result' is in the right ballpark, but why does
>>> HashAggregate think that it can turn 2 *billion* rows of strings (an
>>> average of 30 bytes long) into only 200?
>>
>> 200 is the default assumption about the number of groups when the
>> planner is unable to make any statistics-based estimate. You haven't
>> shown us any details, so it's hard to say more than that.
>
> What sorts of details would you like? The row count for the Result
> line is approximately correct -- the stats for all tables are up to
> date (the tables never change after import). The statistics target is
> currently set to 100.

The query and the full EXPLAIN output (attached as text files) would be a
good place to start....

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
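
A minimal sketch of the behavior Tom describes (the table t and its
contents here are hypothetical, made up for illustration): grouping on a
plain column uses that column's n_distinct statistic, while grouping on an
expression for which the planner has no statistics falls back to the
default assumption of 200 groups:

    -- Hypothetical table for illustration only.
    CREATE TABLE t (x text);
    INSERT INTO t SELECT md5(i::text) FROM generate_series(1, 100000) AS i;
    ANALYZE t;

    -- Plain column: the estimate comes from pg_stats.n_distinct for t.x.
    EXPLAIN SELECT x, count(*) FROM t GROUP BY x;

    -- Expression: no statistics are available, so the planner assumes
    -- 200 distinct groups regardless of the input row count.
    EXPLAIN SELECT substr(x, 1, 4), count(*) FROM t GROUP BY 1;
    --  HashAggregate  (cost=... rows=200 width=...)

The rows=200 in the second plan is the same default estimate that appears
in Jon's HashAggregate node.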