sverhagen@xxxxxxxxxx wrote:
> Thanks - I'm sorry that I was not more specific earlier, but what would
> be *really* helpful is the output of explain analyze, since that also
> shows actual time, # rows and # loops of the inner nestloop.
No problem at all.
EXPLAIN ANALYZE SELECT * FROM events_events LEFT OUTER JOIN
events_event_types ON
eventType_id=events_event_types.id WHERE severity=20 AND (eventType_id
IN (71)) ORDER BY datetime DESC limit 50;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=3.23..200.31 rows=50 width=131) (actual
time=0.070..0.341 rows=50 loops=1)
-> Nested Loop (cost=3.23..49139.16 rows=12466 width=131) (actual
time=0.069..0.309 rows=50 loops=1)
-> Index Scan Backward using
events_events_eventtype_id_datetime_ind on events_events
(cost=0.00..48886.61 rows=12466 width=93) (actual time=0.037..0.144
rows=50 loops=1)
Index Cond: (eventtype_id = 71)
-> Materialize (cost=3.23..3.24 rows=1 width=38) (actual
time=0.001..0.001 rows=1 loops=50)
-> Seq Scan on events_event_types (cost=0.00..3.23
rows=1 width=38) (actual time=0.024..0.029 rows=1 loops=1)
Filter: ((id = 71) AND (severity = 20))
Total runtime: 0.415 ms
(8 rows)
Time: 1.290 ms
EXPLAIN ANALYZE SELECT * FROM events_events LEFT OUTER JOIN
events_event_types ON
eventType_id=events_event_types.id WHERE severity=70 AND (eventType_id
IN (71)) ORDER BY datetime DESC limit 50;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=3.23..200.31 rows=50 width=131) (actual
time=11641.775..11641.775 rows=0 loops=1)
-> Nested Loop (cost=3.23..49139.16 rows=12466 width=131) (actual
time=11641.773..11641.773 rows=0 loops=1)
-> Index Scan Backward using
events_events_eventtype_id_datetime_ind on events_events
(cost=0.00..48886.61 rows=12466 width=93) (actual
time=0.035..11573.320 rows=50389 loops=1)
Index Cond: (eventtype_id = 71)
-> Materialize (cost=3.23..3.24 rows=1 width=38) (actual
time=0.000..0.000 rows=0 loops=50389)
-> Seq Scan on events_event_types (cost=0.00..3.23
rows=1 width=38) (actual time=0.028..0.028 rows=0 loops=1)
Filter: ((id = 71) AND (severity = 70))
Total runtime: 11641.839 ms
(8 rows)
Time: 11642.902 ms
EXPLAIN ANALYZE SELECT * FROM events_events LEFT OUTER JOIN
events_event_types ON
eventType_id=events_event_types.id WHERE severity=70 AND (eventType_id
IN (71, 999)) ORDER BY datetime DESC LIMIT 50;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=27290.26..27290.38 rows=50 width=131) (actual
time=0.118..0.118 rows=0 loops=1)
-> Sort (cost=27290.26..27303.17 rows=5164 width=131) (actual
time=0.117..0.117 rows=0 loops=1)
Sort Key: events_events.datetime
Sort Method: quicksort Memory: 17kB
-> Nested Loop (cost=22.95..27118.71 rows=5164 width=131)
(actual time=0.112..0.112 rows=0 loops=1)
-> Seq Scan on events_event_types (cost=0.00..3.02
rows=17 width=38) (actual time=0.016..0.041 rows=16 loops=1)
Filter: (severity = 70)
-> Bitmap Heap Scan on events_events
(cost=22.95..1589.94 rows=408 width=93) (actual time=0.002..0.002
rows=0 loops=16)
Recheck Cond: ((events_events.eventtype_id = ANY
('{71,999}'::bigint[])) AND (events_events.eventtype_id =
events_event_types.id))
-> Bitmap Index Scan on
test_events_events_eventtype_id_severity_ind (cost=0.00..22.85
rows=408 width=0) (actual time=0.001..0.001 rows=0 loops=16)
Index Cond: ((events_events.eventtype_id =
ANY ('{71,999}'::bigint[])) AND (events_events.eventtype_id =
events_event_types.id))
Total runtime: 0.179 ms
(12 rows)
Time: 1.510 ms
> I'm
> wondering though why you do a left outer join. From the \d output in
the
> previous mail, events_event.eventtype_id has a not null constraint
and a
> fk to events_event_types.id, so an inner join would be appropriate.
> Outer joins limits the amount of join orders the planner considers,
so a
> better plan might arise when the join is changed to inner.
I do agree with this assessment. I'm sort of improving the performance
of an existing implementation of ours, for which I'm not aware why
they chose for LEFT OUTER. I did, however, test things with INNER as
well, with the same results, so I decided to stick with what I
encountered in the existing implementation. But it's on my mind as
well ;-)
Thanks for the formatted output. The difference in speed is caused by
the first query that has to read 50k rows from an index, with filter
expression is only eventtype_id = 71, and the second has the extra
knowledge from the scan of events_event_type in the nestloops outer
loop, which returns 0 rows in all cases and is hence a lot faster, even
though that scan is called 16 times.
But the big question is: why does the planner chooses plan 1 in the
first case, or how to fix that? My $0,02 would be to 'help' the planner
find a better plan. Ofcourse you did ANALYZE, but maybe another idea is
to increase the default_statistics_target if it is still at the default
value of 10. More info on
http://www.postgresql.org/docs/8.3/static/runtime-config-query.html. And
also to 'help' the planner: I'd just change the query to an inner join
in this case, since there cannot be null tuples in the right hand side here.
regards,
Yeb Havinga
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance