I'm having a strange performance issue with two almost similar
queries, the
one running as expected, the other one taking far more time. The only
difference is that I have "uniid in (10)" in the normally running
query and
"uniid in (9,10)" in the other one. The number of rows resulting
from the
respective table differs not very much being 406 for the first and
511 for
the second query.
This is the full query - the "uniid in (9,10)" is in the last
subquery:
SELECT 'Anzahl' AS column1, count(DISTINCT sid) AS column2
FROM (
SELECT sid
FROM stud
WHERE stud.status > 0
AND length(stud.vname) > 1
AND length(stud.nname) > 1
) AS qur_filter_1 INNER JOIN (
SELECT DISTINCT sid
FROM stud_vera
INNER JOIN phon USING (sid)
WHERE veraid = 22
AND stud_vera.status > 0
AND (
(
veraid IN (2, 3, 22, 24, 36)
AND phontyp = 5
AND phon.typ = 1
AND phon.status > 0
) OR (
veraid NOT IN (2, 3, 22, 24, 36)
)
)
) AS qur_filter_2 USING (sid) INNER JOIN (
SELECT DISTINCT sid
FROM ausb
INNER JOIN uni USING (uniid)
WHERE uni.uniort IN ('Augsburg')
AND ausb.overview = 1
AND ausb.zweitstudium != 2
AND ausb.status > 0
) AS qur_filter_3 USING (sid) INNER JOIN (
SELECT DISTINCT sid
FROM ausb
WHERE uniid IN (9, 10)
AND ausb.overview = 1
AND ausb.zweitstudium != 2
AND ausb.status > 0
) AS qur_filter_4 USING (sid)
These are the query-plans for both queries, first the problematic one:
Aggregate (cost=78785.78..78785.79 rows=1 width=4) (actual
time=698777.890..698777.891 rows=1 loops=1)
-> Nested Loop (cost=65462.58..78785.78 rows=1 width=4) (actual
time=6743.856..698776.957 rows=250 loops=1)
Join Filter: ("outer".sid = "inner".sid)
-> Merge Join (cost=11031.79..11883.12 rows=1 width=12)
(actual
time=387.837..433.612 rows=494 loops=1)
Merge Cond: ("outer".sid = "inner".sid)
-> Nested Loop (cost=5643.11..6490.17 rows=19 width=8)
(actual time=114.323..154.043 rows=494 loops=1)
-> Unique (cost=5643.11..5645.35 rows=180
width=4)
(actual time=114.202..116.002 rows=511 loops=1)
-> Sort (cost=5643.11..5644.23 rows=448
width=4)
(actual time=114.199..114.717 rows=511 loops=1)
Sort Key: public.ausb.sid
-> Seq Scan on ausb
(cost=0.00..5623.38
rows=448 width=4) (actual time=0.351..112.459 rows=511 loops=1)
Filter: (((uniid = 9) OR
(uniid = 10))
AND (overview = 1) AND (zweitstudium <> 2) AND (status > 0))
-> Index Scan using stud_pkey on stud
(cost=0.00..4.67
rows=1 width=4) (actual time=0.062..0.067 rows=1 loops=511)
Index Cond: (stud.sid = "outer".sid)
Filter: ((status > 0) AND (length
((vname)::text) >
1) AND (length((nname)::text) > 1))
-> Materialize (cost=5388.68..5392.05 rows=337
width=4)
(actual time=273.506..276.785 rows=511 loops=1)
-> Unique (cost=5383.29..5384.98 rows=337
width=4)
(actual time=273.501..275.421 rows=511 loops=1)
-> Sort (cost=5383.29..5384.13 rows=337
width=4)
(actual time=273.499..274.091 rows=511 loops=1)
Sort Key: public.ausb.sid
-> Hash Join (cost=17.61..5369.14
rows=337
width=4) (actual time=1.139..272.465 rows=511 loops=1)
Hash Cond: ("outer".uniid =
"inner".uniid)
-> Seq Scan on ausb
(cost=0.00..4827.30 rows=104174 width=8) (actual time=0.026..200.111
rows=103593 loops=1)
Filter: ((overview = 1)
AND
(zweitstudium <> 2) AND (status > 0))
-> Hash (cost=17.60..17.60
rows=2
width=4) (actual time=0.435..0.435 rows=2 loops=1)
-> Seq Scan on uni
(cost=0.00..17.60 rows=2 width=4) (actual time=0.412..0.424 rows=2
loops=1)
Filter:
((uniort)::text =
'Augsburg'::text)
-> Unique (cost=54430.79..66664.18 rows=10599 width=4)
(actual
time=6.851..1374.135 rows=40230 loops=494)
-> Merge Join (cost=54430.79..66319.65 rows=137811
width=4)
(actual time=6.849..1282.333 rows=40233 loops=494)
Merge Cond: ("outer".sid = "inner".sid)
Join Filter: (((("outer".veraid = 2) OR
("outer".veraid
= 3) OR ("outer".veraid = 22) OR ("outer".veraid = 24) OR
("outer".veraid =
36)) AND ("inner".phontyp = 5) AND ("inner".typ = 1) AND
("inner".status >
0)) OR (("outer".veraid <> 2) AND ("outer".veraid <> 3) AND
("outer".veraid
<> 22) AND ("outer".veraid <> 24) AND ("outer".veraid <> 36)))
-> Sort (cost=11962.11..12098.59 rows=54593
width=8)
(actual time=0.547..46.482 rows=53354 loops=494)
Sort Key: stud_vera.sid
-> Bitmap Heap Scan on stud_vera
(cost=2239.14..7666.61 rows=54593 width=8) (actual
time=43.096..165.300
rows=53354 loops=1)
Recheck Cond: (veraid = 22)
Filter: (status > 0)
-> Bitmap Index Scan on
stud_vera_sid_veraid_idx (cost=0.00..2239.14 rows=58765 width=0)
(actual
time=41.242..41.242 rows=61855 loops=1)
Index Cond: (veraid = 22)
-> Sort (cost=42468.68..43407.53 rows=375539
width=12)
(actual time=6.297..533.711 rows=375539 loops=494)
Sort Key: phon.sid
-> Seq Scan on phon (cost=0.00..7696.39
rows=375539 width=12) (actual time=0.048..544.999 rows=375539 loops=1)
!!!! The query-plan for the normally running query: !!!!!
Aggregate (cost=77846.97..77846.98 rows=1 width=4) (actual
time=5488.913..5488.913 rows=1 loops=1)
-> Nested Loop (cost=65471.70..77846.97 rows=1 width=4) (actual
time=3913.839..5488.513 rows=208 loops=1)
Join Filter: ("outer".sid = "inner".sid)
-> Merge Join (cost=60088.41..72454.41 rows=1 width=12)
(actual
time=3598.105..4841.242 rows=208 loops=1)
Merge Cond: ("outer".sid = "inner".sid)
-> Unique (cost=54430.79..66664.18 rows=10599 width=4)
(actual time=3479.029..4697.051 rows=40129 loops=1)
-> Merge Join (cost=54430.79..66319.65
rows=137811
width=4) (actual time=3479.027..4616.245 rows=40132 loops=1)
Merge Cond: ("outer".sid = "inner".sid)
Join Filter: (((("outer".veraid = 2) OR
("outer".veraid = 3) OR ("outer".veraid = 22) OR ("outer".veraid =
24) OR
("outer".veraid = 36)) AND ("inner".phontyp = 5) AND ("inner".typ =
1) AND
("inner".status > 0)) OR (("outer".veraid <> 2) AND ("outer".veraid
<> 3)
AND ("outer".veraid <> 22) AND ("outer".veraid <> 24) AND
("outer".veraid <>
36)))
-> Sort (cost=11962.11..12098.59
rows=54593
width=8) (actual time=274.248..315.052 rows=53252 loops=1)
Sort Key: stud_vera.sid
-> Bitmap Heap Scan on stud_vera
(cost=2239.14..7666.61 rows=54593 width=8) (actual
time=46.669..167.599
rows=53352 loops=1)
Recheck Cond: (veraid = 22)
Filter: (status > 0)
-> Bitmap Index Scan on
stud_vera_sid_veraid_idx (cost=0.00..2239.14 rows=58765 width=0)
(actual
time=44.618..44.618 rows=61857 loops=1)
Index Cond: (veraid = 22)
-> Sort (cost=42468.68..43407.53
rows=375539
width=12) (actual time=3204.729..3681.598 rows=375090 loops=1)
Sort Key: phon.sid
-> Seq Scan on phon
(cost=0.00..7696.39
rows=375539 width=12) (actual time=0.052..628.838 rows=375539 loops=1)
-> Materialize (cost=5657.62..5657.71 rows=9 width=8)
(actual time=91.290..105.557 rows=406 loops=1)
-> Nested Loop (cost=5234.08..5657.61 rows=9
width=8)
(actual time=91.282..104.571 rows=406 loops=1)
-> Unique (cost=5234.08..5235.20 rows=90
width=4) (actual time=91.156..92.232 rows=420 loops=1)
-> Sort (cost=5234.08..5234.64
rows=224
width=4) (actual time=91.154..91.484 rows=420 loops=1)
Sort Key: public.ausb.sid
-> Seq Scan on ausb
(cost=0.00..5225.34 rows=224 width=4) (actual time=0.266..90.242
rows=420
loops=1)
Filter: ((uniid = 10) AND
(overview = 1) AND (zweitstudium <> 2) AND (status > 0))
-> Index Scan using stud_pkey on stud
(cost=0.00..4.67 rows=1 width=4) (actual time=0.024..0.026 rows=1
loops=420)
Index Cond: (stud.sid = "outer".sid)
Filter: ((status > 0) AND
(length((vname)::text) > 1) AND (length((nname)::text) > 1))
-> Unique (cost=5383.29..5384.98 rows=337 width=4) (actual
time=1.520..2.686 rows=511 loops=208)
-> Sort (cost=5383.29..5384.13 rows=337 width=4)
(actual
time=1.519..1.871 rows=511 loops=208)
Sort Key: public.ausb.sid
-> Hash Join (cost=17.61..5369.14 rows=337
width=4)
(actual time=1.133..314.584 rows=511 loops=1)
Hash Cond: ("outer".uniid = "inner".uniid)
-> Seq Scan on ausb (cost=0.00..4827.30
rows=104174 width=8) (actual time=0.030..226.532 rows=103593 loops=1)
Filter: ((overview = 1) AND
(zweitstudium <>
2) AND (status > 0))
-> Hash (cost=17.60..17.60 rows=2 width=4)
(actual time=0.392..0.392 rows=2 loops=1)
-> Seq Scan on uni (cost=0.00..17.60
rows=2 width=4) (actual time=0.369..0.381 rows=2 loops=1)
Filter: ((uniort)::text =
'Augsburg'::text)
The estimated row numbers are not bad as long as one table is
affected.
They're much worse as soon as two or more tables are joined. Though
the
query plans are slightly different, the number of merged rows at
different
stages seems to be rather the same for both plans. The big
difference in my
eyes seems the cost for the first nested loop. This seems to be the
point,
where the long running query consumes most time. I've then set
enable_nestloop to off, and actually the problem disappears.
Other maybe relevant parameters:
default_statistics_target = 100
work_mem = 4096
max_fsm_pages = 100000
My questions:
What could be the problem behind high amount of actually used time
for the
nested loop in the first query?
If we decided to constantly turn off nested loops, what side
effects would
we have to expect?
Are there more granular ways to tell the query planner when to use
nested
loops?
Or just other ideas what to do? We'd be grateful for any hint!