On Sun, Feb 15, 2009 at 5:45 PM, Alexander Staubo <alex@xxxxxxxxxx> wrote:
> On Sun, Feb 15, 2009 at 5:29 AM, David Wilson <david.t.wilson@xxxxxxxxx> wrote:
>> On Sat, Feb 14, 2009 at 5:25 PM, Alexander Staubo <alex@xxxxxxxxxx> wrote:
>>>
>>> Output from "explain analyze":
>>>
>>>  Limit  (cost=0.00..973.63 rows=4 width=48) (actual
>>> time=61.554..4039.704 rows=1 loops=1)
>>>    ->  Nested Loop  (cost=0.00..70101.65 rows=288 width=48) (actual
>>> time=61.552..4039.700 rows=1 loops=1)
>>>          ->  Nested Loop  (cost=0.00..68247.77 rows=297 width=52)
>>> (actual time=61.535..4039.682 rows=1 loops=1)
>>
>> Those estimates are pretty far off. Did you try increasing the
>> statistics target? Also, is the first query repeatable (that is, is it
>> already in cache when you do the test, or alternately, are all queries
>> *out* of cache when you test?)

All right, this query keeps coming back to bite me. If this part of
the join:

  ... and section_items.sandbox_id = 9

yields a sufficiently large number of matches, then performance goes
'boink', like so:

 Limit  (cost=0.00..34.86 rows=4 width=48) (actual time=4348.696..4348.696 rows=0 loops=1)
   ->  Nested Loop  (cost=0.00..60521.56 rows=6944 width=48) (actual time=4348.695..4348.695 rows=0 loops=1)
         ->  Index Scan using index_event_occurrences_on_start_time on event_occurrences  (cost=0.00..11965.38 rows=145712 width=48) (actual time=0.093..138.029 rows=145108 loops=1)
               Index Cond: (start_time > '2009-02-27 18:01:14.739411+01'::timestamp with time zone)
         ->  Index Scan using index_section_items_on_subject_type_and_subject_id on section_items  (cost=0.00..0.32 rows=1 width=4) (actual time=0.029..0.029 rows=0 loops=145108)
               Index Cond: (((section_items.subject_type)::text = 'Event'::text) AND (section_items.subject_id = event_occurrences.event_id))
               Filter: (section_items.sandbox_id = 9)
 Total runtime: 4348.777 ms

In this case:

  # select count(*) from section_items where sandbox_id = 9;
   count
  -------
    3126

If I remove the "start_time > ..." clause, performance is fine. Upping
the statistics target on any of the columns involved seems to have no
effect.

Is this a pathological edge case, or is there something I can do to
*generally* make this query run fast? Keep in mind that the query
itself returns no rows at all. I want to avoid doing an initial
"select count(...)" just to sidestep the bad plan. Suffice it to say,
having a web request take 5 seconds is asking too much of our users.

Alexander.
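For reference, the statistics target David suggests raising is set per
column and only takes effect after a fresh ANALYZE. A minimal sketch
against the columns visible in the plan above (the value 1000 is an
arbitrary example, not a recommendation):

  ALTER TABLE section_items ALTER COLUMN sandbox_id SET STATISTICS 1000;
  ALTER TABLE event_occurrences ALTER COLUMN start_time SET STATISTICS 1000;
  ANALYZE section_items;
  ANALYZE event_occurrences;

Without the ANALYZE the new target has no effect, since the planner
keeps using the histograms gathered under the old target.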
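To check whether the planner has a better join strategy available at
all, nested loops can be disabled for a single session as a diagnostic.
This merely penalizes that plan type so an alternative wins; it is not
a production fix:

  BEGIN;
  SET LOCAL enable_nestloop = off;
  EXPLAIN ANALYZE ... ;   -- the problem query from above
  ROLLBACK;

If the resulting plan runs quickly, the slowness is a row-estimate
problem rather than a missing index.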
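A general workaround often suggested on this list is to fence off the
selective half of the join so the planner cannot flatten it into the
nested loop. The full query is never quoted in this message, so the
sketch below reconstructs its shape from the EXPLAIN output; the ORDER
BY/LIMIT part and the alias names are assumptions:

  SELECT eo.*
  FROM event_occurrences eo
  JOIN (SELECT subject_id
        FROM section_items
        WHERE sandbox_id = 9
          AND subject_type = 'Event'
        OFFSET 0) si               -- OFFSET 0 blocks subquery pull-up
    ON si.subject_id = eo.event_id
  WHERE eo.start_time > '2009-02-27 18:01:14.739411+01'
  ORDER BY eo.start_time
  LIMIT 4;

With the OFFSET 0 fence the subquery is planned on its own, which
usually leads the planner to materialize or hash the ~3000 matching
section_items rows rather than probe section_items once per future
event_occurrences row (145,108 times in the plan above).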