On Sun, Feb 15, 2009 at 5:45 PM, Alexander Staubo <alex@xxxxxxxxxx> wrote:
> On Sun, Feb 15, 2009 at 5:29 AM, David Wilson <david.t.wilson@xxxxxxxxx> wrote:
>> On Sat, Feb 14, 2009 at 5:25 PM, Alexander Staubo <alex@xxxxxxxxxx> wrote:
>>>
>>> Output from "explain analyze":
>>>
>>>  Limit  (cost=0.00..973.63 rows=4 width=48) (actual
>>> time=61.554..4039.704 rows=1 loops=1)
>>>    ->  Nested Loop  (cost=0.00..70101.65 rows=288 width=48) (actual
>>> time=61.552..4039.700 rows=1 loops=1)
>>>          ->  Nested Loop  (cost=0.00..68247.77 rows=297 width=52)
>>> (actual time=61.535..4039.682 rows=1 loops=1)
>>
>> Those estimates are pretty far off. Did you try increasing the
>> statistics target? Also, is the first query repeatable (that is, is it
>> already in cache when you do the test, or alternately, are all queries
>> *out* of cache when you test?)

All right, this query keeps coming back to bite me. If this part of
the join:

  ... and section_items.sandbox_id = 9

yields a sufficiently large number of matches, then performance goes
'boink', like so:

 Limit  (cost=0.00..34.86 rows=4 width=48) (actual time=4348.696..4348.696 rows=0 loops=1)
   ->  Nested Loop  (cost=0.00..60521.56 rows=6944 width=48) (actual time=4348.695..4348.695 rows=0 loops=1)
         ->  Index Scan using index_event_occurrences_on_start_time on event_occurrences  (cost=0.00..11965.38 rows=145712 width=48) (actual time=0.093..138.029 rows=145108 loops=1)
               Index Cond: (start_time > '2009-02-27 18:01:14.739411+01'::timestamp with time zone)
         ->  Index Scan using index_section_items_on_subject_type_and_subject_id on section_items  (cost=0.00..0.32 rows=1 width=4) (actual time=0.029..0.029 rows=0 loops=145108)
               Index Cond: (((section_items.subject_type)::text = 'Event'::text) AND (section_items.subject_id = event_occurrences.event_id))
               Filter: (section_items.sandbox_id = 9)
 Total runtime: 4348.777 ms

In this case:

  # select count(*) from section_items where sandbox_id = 9;
   count
  -------
    3126

If I remove the "start_time > ..." clause, performance is fine. Upping
the statistics target on any of the columns involved seems to have no
effect.

Is this a pathological edge case, or is there something I can do to
*generally* make this query run fast? Keep in mind that the query
itself returns no rows at all. I want to avoid doing an initial
"select count(...)" just to sidestep the bad plan. Suffice it to say,
having a web request take 5 seconds is asking too much of our users.

Alexander.
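For reference, the statistics target David suggests raising is set per
column and only takes effect after a fresh ANALYZE. A minimal sketch
against the columns visible in the plan above (the value 1000 is an
arbitrary example, not a recommendation):

  ALTER TABLE section_items ALTER COLUMN sandbox_id SET STATISTICS 1000;
  ALTER TABLE event_occurrences ALTER COLUMN start_time SET STATISTICS 1000;
  ANALYZE section_items;
  ANALYZE event_occurrences;

Without the ANALYZE the new target has no effect, since the planner
keeps using the histograms gathered under the old target.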
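To check whether the planner has a better join strategy available at
all, nested loops can be disabled for a single session as a diagnostic.
This merely penalizes that plan type so an alternative wins; it is not
a production fix:

  BEGIN;
  SET LOCAL enable_nestloop = off;
  EXPLAIN ANALYZE ... ;   -- the problem query from above
  ROLLBACK;

If the resulting plan runs quickly, the slowness is a row-estimate
problem rather than a missing index.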
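A general workaround often suggested on this list is to fence off the
selective half of the join so the planner cannot flatten it into the
nested loop. The full query is never quoted in this message, so the
sketch below reconstructs its shape from the EXPLAIN output; the ORDER
BY/LIMIT part and the alias names are assumptions:

  SELECT eo.*
  FROM event_occurrences eo
  JOIN (SELECT subject_id
        FROM section_items
        WHERE sandbox_id = 9
          AND subject_type = 'Event'
        OFFSET 0) si               -- OFFSET 0 blocks subquery pull-up
    ON si.subject_id = eo.event_id
  WHERE eo.start_time > '2009-02-27 18:01:14.739411+01'
  ORDER BY eo.start_time
  LIMIT 4;

With the OFFSET 0 fence the subquery is planned on its own, which
usually leads the planner to materialize or hash the ~3000 matching
section_items rows rather than probe section_items once per future
event_occurrences row (145,108 times in the plan above).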