Re: Very poor performance

Aaron Burnett <aburnett@xxxxxxxxxxxx> · Tue, 17 Aug 2010 13:54:09 -0400

Thanks for the response kevin. Answers interspersed below.

On 8/17/10 10:18 AM, "Kevin Grittner" <Kevin.Grittner@xxxxxxxxxxxx> wrote:

> "Aaron Burnett" <aburnett@xxxxxxxxxxxx> wrote:
>  
>> Postgres Version 8.25
>  
> Do you mean 8.2.5?  (There is no PostgreSQL version 8.25.)
>  

Yeah, missed a '.', it's 8.2.5

> If you're concerned about performance and you're still on 8.2, you
> might want to consider updating to a new major version.
>  
>> 16 Gig RAM
>> 192MB work_mem  (increasing to 400MB didn't change the outcome)
>  
> What other non-default settings do you have?

maintenance_work_mem = 1024MB
max_stack_depth = 8MB
max_fsm_pages = 8000000
max_fsm_relations = 2000

>  
>> explain analyze select distinct(id) from member  where id in
>> (select memberid from answerselectinstance where   nswerid =
>> 127443  OR answerid = 127444  OR  answerid = 127445  OR  answerid
>> = 127446  OR  answerid = 127447  OR  answerid = 127448   ) ;
>  
> How does this do?:
>  
> explain analyze
> select distinct(m.id)
>   from answerselectinstance a
>   join member m
>     on m.id = a.memberid
>   where a.answerid between 127443 and 127448
> ;
>  
> -Kevin

Unfortunately because of the way the application does the building of the
variables (answerid) and the query, these were only coincidentally in
numeric order, so the query and resulting plan will look more like this:
(and it finishes fast)

LOG:  duration: 4875.943 ms  statement: explain analyze select
distinct(m.id)
  from answerselectinstance a
  join member m
    on m.id = a.memberid
  where a.answerid = 127443  OR  answerid = 127444  OR
a.answerid = 127445  OR a.answerid = 127446  OR a.answerid = 127447 OR
a.answerid = 127448;

QUERY PLAN         
----------------------------------------------------------------------------
----------------------------------------------------------------------------
--------------------
 Unique  (cost=265346.57..265884.69 rows=107623 width=4) (actual
time=4362.948..4751.042 rows=143563 loops=1)
   ->  Sort  (cost=265346.57..265615.63 rows=107623 width=4) (actual
time=4362.945..4489.002 rows=143820 loops=1)
         Sort Key: m.id
         ->  Hash Join  (cost=112462.72..256351.64 rows=107623 width=4)
(actual time=2246.333..4134.240 rows=143820 loops=1)
               Hash Cond: (a.memberid = m.id)
               ->  Bitmap Heap Scan on answerselectinstance a
(cost=1363.57..142561.92 rows=107623 width=4) (actual time=84.082..1447.093
rows=143820 loops=1)
                     Recheck Cond: ((answerid = 127443) OR (answerid =
127444) OR (answerid = 127445) OR (answerid = 127446) OR (answerid = 127447)
OR (answerid = 127448))
                     ->  BitmapOr  (cost=1363.57..1363.57 rows=107651
width=0) (actual time=41.723..41.723 rows=0 loops=1)
                           ->  Bitmap Index Scan on asi_answerid_idx
(cost=0.00..200.36 rows=17942 width=0) (actual time=8.133..8.133 rows=32614
loops=1)
                                 Index Cond: (answerid = 127443)
                           ->  Bitmap Index Scan on asi_answerid_idx
(cost=0.00..200.36 rows=17942 width=0) (actual time=6.498..6.498 rows=23539
loops=1)
                                 Index Cond: (answerid = 127444)
                           ->  Bitmap Index Scan on asi_answerid_idx
(cost=0.00..200.36 rows=17942 width=0) (actual time=5.935..5.935 rows=20368
loops=1)
                                 Index Cond: (answerid = 127445)
                           ->  Bitmap Index Scan on asi_answerid_idx
(cost=0.00..200.36 rows=17942 width=0) (actual time=6.619..6.619 rows=21812
loops=1)
                                 Index Cond: (answerid = 127446)
                           ->  Bitmap Index Scan on asi_answerid_idx
(cost=0.00..200.36 rows=17942 width=0) (actual time=3.039..3.039 rows=9562
loops=1)
                                 Index Cond: (answerid = 127447)
                           ->  Bitmap Index Scan on asi_answerid_idx
(cost=0.00..200.36 rows=17942 width=0) (actual time=11.490..11.490
rows=35925 loops=1)
                                 Index Cond: (answerid = 127448)
               ->  Hash  (cost=103267.40..103267.40 rows=626540 width=4)
(actual time=2161.933..2161.933 rows=626626 loops=1)
                     ->  Seq Scan on member m  (cost=0.00..103267.40
rows=626540 width=4) (actual time=0.009..1467.145 rows=626626 loops=1)
 Total runtime: 4875.015 ms

I got it to run a million times faster than in it's original form simply by
removing the 'distinct' on the m.id because m.id is a primary key and adding
the distinct to a.memberid, but by changing the query in any way it breaks
some other part of the application as this is just a small part of the total
"building process".

I may be stuck between a rock and a very hard place as we don't have the
resources at this time for someone to rewite the whole building (this is
just a tiny part of the process that does what we call 'group building')
procedure.

Thanks to everyone that has responded thus far. Your input is appreciated
and welcomed.

Aaron

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance