Thanks for the response kevin. Answers interspersed below. On 8/17/10 10:18 AM, "Kevin Grittner" <Kevin.Grittner@xxxxxxxxxxxx> wrote: > "Aaron Burnett" <aburnett@xxxxxxxxxxxx> wrote: > >> Postgres Version 8.25 > > Do you mean 8.2.5? (There is no PostgreSQL version 8.25.) > Yeah, missed a '.', it's 8.2.5 > If you're concerned about performance and you're still on 8.2, you > might want to consider updating to a new major version. > >> 16 Gig RAM >> 192MB work_mem (increasing to 400MB didn't change the outcome) > > What other non-default settings do you have? maintenance_work_mem = 1024MB max_stack_depth = 8MB max_fsm_pages = 8000000 max_fsm_relations = 2000 > >> explain analyze select distinct(id) from member where id in >> (select memberid from answerselectinstance where nswerid = >> 127443 OR answerid = 127444 OR answerid = 127445 OR answerid >> = 127446 OR answerid = 127447 OR answerid = 127448 ) ; > > How does this do?: > > explain analyze > select distinct(m.id) > from answerselectinstance a > join member m > on m.id = a.memberid > where a.answerid between 127443 and 127448 > ; > > -Kevin Unfortunately because of the way the application does the building of the variables (answerid) and the query, these were only coincidentally in numeric order, so the query and resulting plan will look more like this: (and it finishes fast) LOG: duration: 4875.943 ms statement: explain analyze select distinct(m.id) from answerselectinstance a join member m on m.id = a.memberid where a.answerid = 127443 OR answerid = 127444 OR a.answerid = 127445 OR a.answerid = 127446 OR a.answerid = 127447 OR a.answerid = 127448; QUERY PLAN ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- -------------------- Unique (cost=265346.57..265884.69 rows=107623 width=4) (actual time=4362.948..4751.042 rows=143563 loops=1) -> Sort (cost=265346.57..265615.63 rows=107623 width=4) (actual time=4362.945..4489.002 rows=143820 loops=1) Sort Key: m.id -> Hash Join (cost=112462.72..256351.64 rows=107623 width=4) (actual time=2246.333..4134.240 rows=143820 loops=1) Hash Cond: (a.memberid = m.id) -> Bitmap Heap Scan on answerselectinstance a (cost=1363.57..142561.92 rows=107623 width=4) (actual time=84.082..1447.093 rows=143820 loops=1) Recheck Cond: ((answerid = 127443) OR (answerid = 127444) OR (answerid = 127445) OR (answerid = 127446) OR (answerid = 127447) OR (answerid = 127448)) -> BitmapOr (cost=1363.57..1363.57 rows=107651 width=0) (actual time=41.723..41.723 rows=0 loops=1) -> Bitmap Index Scan on asi_answerid_idx (cost=0.00..200.36 rows=17942 width=0) (actual time=8.133..8.133 rows=32614 loops=1) Index Cond: (answerid = 127443) -> Bitmap Index Scan on asi_answerid_idx (cost=0.00..200.36 rows=17942 width=0) (actual time=6.498..6.498 rows=23539 loops=1) Index Cond: (answerid = 127444) -> Bitmap Index Scan on asi_answerid_idx (cost=0.00..200.36 rows=17942 width=0) (actual time=5.935..5.935 rows=20368 loops=1) Index Cond: (answerid = 127445) -> Bitmap Index Scan on asi_answerid_idx (cost=0.00..200.36 rows=17942 width=0) (actual time=6.619..6.619 rows=21812 loops=1) Index Cond: (answerid = 127446) -> Bitmap Index Scan on asi_answerid_idx (cost=0.00..200.36 rows=17942 width=0) (actual time=3.039..3.039 rows=9562 loops=1) Index Cond: (answerid = 127447) -> Bitmap Index Scan on asi_answerid_idx (cost=0.00..200.36 rows=17942 width=0) (actual time=11.490..11.490 rows=35925 loops=1) Index Cond: (answerid = 127448) -> Hash (cost=103267.40..103267.40 rows=626540 width=4) (actual time=2161.933..2161.933 rows=626626 loops=1) -> Seq Scan on member m (cost=0.00..103267.40 rows=626540 width=4) (actual time=0.009..1467.145 rows=626626 loops=1) Total runtime: 4875.015 ms I got it to run a million times faster than in it's original form simply by removing the 'distinct' on the m.id because m.id is a primary key and adding the distinct to a.memberid, but by changing the query in any way it breaks some other part of the application as this is just a small part of the total "building process". I may be stuck between a rock and a very hard place as we don't have the resources at this time for someone to rewite the whole building (this is just a tiny part of the process that does what we call 'group building') procedure. Thanks to everyone that has responded thus far. Your input is appreciated and welcomed. Aaron -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance