Re: Slow query with a lot of data

Moritz Onken <onken@xxxxxxxxxxxxxxxx> · Thu, 21 Aug 2008 09:45:32 +0200

Am 21.08.2008 um 09:04 schrieb Moritz Onken:

Am 20.08.2008 um 20:28 schrieb Tom Lane:

"Scott Carey" <scott@xxxxxxxxxxxxxxxxx> writes:
The planner actually thinks there will only be 28704 rows returned  
of width
12.  But it chooses to sort 53 million rows before aggregating.   
Thats
either a bug or there's something else wrong here.   That is the  
wrong way
to aggregate those results no matter how much work_mem you have  
unless I'm
completely missing something...

That does look weird.  What are the datatypes of the columns being
grouped by?  Maybe they're not hashable?

Another forcing function that prevents use of HashAgg is DISTINCT
aggregates, but you don't seem to have any in this query...

			regards, tom lane

The datatypes are both integers. There is no DISTINCT in this query.
Thanks anyway!

insert into setup1 (select
  a."user", b.category, sum(1.0/b.cat_count)::float
  from result a, domain_categories b
  where a."domain" = b."domain"
  and b.depth < 4
  and a.results > 100
  group by a."user", b.category);

This query inserted a total of 16,000,000 rows and, with work_mem set  
to 3000mb,
took about 24 hours.

Any more ideas to speed this up?