Re: parallel query evaluation

Tom Lane <tgl@xxxxxxxxxxxxx> · Sat, 10 Nov 2012 10:32:25 -0500

Oliver Seidel <postgresql@xxxxxxxxxxx> writes:
> I have
>              create table x ( att bigint, val bigint, hash varchar(30) );
> with 693 million rows.  The query

>              create table y as select att, val, count(*) as cnt
>              from x group by att, val;

> ran for more than 2000 minutes and used 14 GB of memory on a machine
> with 8 GB of physical RAM.

What was the plan for that query?  What did you have work_mem set to?
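
Assuming the table and column names from your message, something like
the following should show both.  Plain EXPLAIN only plans the query; it
does not run it, so it is safe on a table this size.

```sql
-- Current per-operation memory budget for sorts and hash tables.
SHOW work_mem;

-- Plan the aggregate without executing it.  Look for HashAggregate
-- versus Sort + GroupAggregate, and at the estimated row count on
-- the aggregate node.
EXPLAIN
SELECT att, val, count(*) AS cnt
FROM x
GROUP BY att, val;
```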

I can believe such a thing overrunning memory if the planner chose a
hash-aggregation plan (HashAggregate) instead of sort-and-group
(Sort followed by GroupAggregate), but it would only do that if it had
drastically underestimated the number of groups implied by the
GROUP BY clause.  Do you have up-to-date statistics for the source
table?
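
A rough sketch of how to check that, again assuming table x as quoted
above.  The enable_hashagg change is meant only as a session-local
diagnostic, not as a setting to leave off globally.

```sql
-- Refresh planner statistics for the source table.
ANALYZE x;

-- Per-column distinct-value estimates the planner works from.
-- A negative n_distinct is a fraction of the row count, negated.
SELECT attname, n_distinct
FROM pg_stats
WHERE tablename = 'x' AND attname IN ('att', 'val');

-- Diagnostic only: discourage hash aggregation in this session and
-- see which plan the planner picks instead.
SET enable_hashagg = off;
EXPLAIN SELECT att, val, count(*) AS cnt FROM x GROUP BY att, val;
RESET enable_hashagg;
```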

			regards, tom lane
