Hi all,
I have PostgreSQL 8.1.4 running on a P 4 2.8 GHz , 512 MB with Linux
(Fedora Core 3)
The SQL comands below have a performance diference that I think is not
so much acceptable ( 1035.427 ms vs 7.209 ms ), since the tables isn´t
so much big ( contrato have 1907 rows and prog have 40.002 rows )
Can I make some optimization here ?
EXPLAIN ANALYZE
SELECT Contrato.Id
, Min( prog.dtsemeio ) AS DtSemIni
, Max( prog.dtsemeio ) AS DtSemFim
, Min( prog.dtembarque ) AS DtEmbIni
, Max( prog.dtembarque ) AS DtEmbFim
, Min( prog.dtentrega ) AS DtEntIni
, Max( prog.dtentrega ) AS DtEntFim
, COUNT(prog.*) AS QtSem
, SUM( CASE WHEN Prog.DtSemeio >= '20060814' THEN 1 ELSE 0 END )
AS QtSemAb
FROM bvz.Contrato
LEFT OUTER JOIN bvz.Prog ON prog.Fk_Contrato = Contrato.Id
WHERE Contrato.Fk_Clifor = 243
GROUP BY 1;
QUERY
PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=5477.34..5706.84 rows=41 width=48) (actual
time=883.721..1031.159 rows=41 loops=1)
-> Merge Left Join (cost=5477.34..5686.15 rows=860 width=48)
(actual time=868.038..1026.988 rows=1366 loops=1)
Merge Cond: ("outer".id = "inner".fk_contrato)
-> Sort (cost=50.39..50.49 rows=41 width=4) (actual
time=0.614..0.683 rows=41 loops=1)
Sort Key: contrato.id
-> Bitmap Heap Scan on contrato (cost=2.14..49.29
rows=41 width=4) (actual time=0.163..0.508 rows=41 loops=1)
Recheck Cond: (fk_clifor = 243)
-> Bitmap Index Scan on fki_contrato_clifor
(cost=0.00..2.14 rows=41 width=0) (actual time=0.146..0.146 rows=41 loops=1)
Index Cond: (fk_clifor = 243)
-> Sort (cost=5426.95..5526.95 rows=40002 width=48) (actual
time=862.192..956.903 rows=38914 loops=1)
Sort Key: prog.fk_contrato
-> Seq Scan on prog (cost=0.00..1548.02 rows=40002
width=48) (actual time=0.044..169.795 rows=40002 loops=1)
Total runtime: 1035.427 ms
EXPLAIN ANALYZE
SELECT Contrato.Id
, Min( prog.dtsemeio ) AS DtSemIni
, Max( prog.dtsemeio ) AS DtSemFim
, Min( prog.dtembarque ) AS DtEmbIni
, Max( prog.dtembarque ) AS DtEmbFim
, Min( prog.dtentrega ) AS DtEntIni
, Max( prog.dtentrega ) AS DtEntFim
, COUNT(prog.*) AS QtSem
, SUM( CASE WHEN Prog.DtSemeio >= '20060814' THEN 1 ELSE 0 END )
AS QtSemAb
FROM bvz.Contrato
LEFT OUTER JOIN bvz.Prog ON prog.Fk_Contrato = Contrato.Id
WHERE Contrato.Fk_Clifor = 352
GROUP BY 1;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=2.16..4588.74 rows=28 width=48) (actual
time=2.196..7.027 rows=28 loops=1)
-> Nested Loop Left Join (cost=2.16..4574.63 rows=587 width=48)
(actual time=2.042..6.154 rows=223 loops=1)
-> Index Scan using pk_contrato on contrato
(cost=0.00..100.92 rows=28 width=4) (actual time=1.842..3.045 rows=28
loops=1)
Filter: (fk_clifor = 352)
-> Bitmap Heap Scan on prog (cost=2.16..159.19 rows=47
width=48) (actual time=0.040..0.080 rows=8 loops=28)
Recheck Cond: (prog.fk_contrato = "outer".id)
-> Bitmap Index Scan on fki_prog_contrato
(cost=0.00..2.16 rows=47 width=0) (actual time=0.018..0.018 rows=8 loops=28)
Index Cond: (prog.fk_contrato = "outer".id)
Total runtime: 7.209 ms
I think that the problem is in "LEFT OUTER JOIN" because when I run the
queries with a inner join I have more consistent times,
although the query plan above is a champion :
EXPLAIN ANALYZE
SELECT Contrato.Id
, Min( prog.dtsemeio ) AS DtSemIni
, Max( prog.dtsemeio ) AS DtSemFim
, Min( prog.dtembarque ) AS DtEmbIni
, Max( prog.dtembarque ) AS DtEmbFim
, Min( prog.dtentrega ) AS DtEntIni
, Max( prog.dtentrega ) AS DtEntFim
, COUNT(prog.*) AS QtSem
, SUM( CASE WHEN Prog.DtSemeio >= '20060814' THEN 1 ELSE 0 END )
AS QtSemAb
FROM bvz.Contrato
JOIN bvz.Prog ON prog.Fk_Contrato = Contrato.Id
WHERE Contrato.Fk_Clifor = 243
GROUP BY 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=1825.38..1826.71 rows=41 width=48) (actual
time=222.671..222.788 rows=41 loops=1)
-> Hash Join (cost=49.40..1806.03 rows=860 width=48) (actual
time=2.040..217.963 rows=1366 loops=1)
Hash Cond: ("outer".fk_contrato = "inner".id)
-> Seq Scan on prog (cost=0.00..1548.02 rows=40002 width=48)
(actual time=0.047..150.636 rows=40002 loops=1)
-> Hash (cost=49.29..49.29 rows=41 width=4) (actual
time=0.766..0.766 rows=41 loops=1)
-> Bitmap Heap Scan on contrato (cost=2.14..49.29
rows=41 width=4) (actual time=0.146..0.669 rows=41 loops=1)
Recheck Cond: (fk_clifor = 243)
-> Bitmap Index Scan on fki_contrato_clifor
(cost=0.00..2.14 rows=41 width=0) (actual time=0.101..0.101 rows=41 loops=1)
Index Cond: (fk_clifor = 243)
Total runtime: 223.230 ms
EXPLAIN ANALYZE
SELECT Contrato.Id
, Min( prog.dtsemeio ) AS DtSemIni
, Max( prog.dtsemeio ) AS DtSemFim
, Min( prog.dtembarque ) AS DtEmbIni
, Max( prog.dtembarque ) AS DtEmbFim
, Min( prog.dtentrega ) AS DtEntIni
, Max( prog.dtentrega ) AS DtEntFim
, COUNT(prog.*) AS QtSem
, SUM( CASE WHEN Prog.DtSemeio >= '20060814' THEN 1 ELSE 0 END )
AS QtSemAb
FROM bvz.Contrato
JOIN bvz.Prog ON prog.Fk_Contrato = Contrato.Id
WHERE Contrato.Fk_Clifor = 352
GROUP BY 1;
QUERY
PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=1811.50..1812.41 rows=28 width=48) (actual
time=215.214..215.291 rows=28 loops=1)
-> Hash Join (cost=44.39..1798.29 rows=587 width=48) (actual
time=3.853..214.178 rows=223 loops=1)
Hash Cond: ("outer".fk_contrato = "inner".id)
-> Seq Scan on prog (cost=0.00..1548.02 rows=40002 width=48)
(actual time=0.075..150.701 rows=40002 loops=1)
-> Hash (cost=44.32..44.32 rows=28 width=4) (actual
time=0.248..0.248 rows=28 loops=1)
-> Bitmap Heap Scan on contrato (cost=2.10..44.32
rows=28 width=4) (actual time=0.111..0.187 rows=28 loops=1)
Recheck Cond: (fk_clifor = 352)
-> Bitmap Index Scan on fki_contrato_clifor
(cost=0.00..2.10 rows=28 width=0) (actual time=0.101..0.101 rows=28 loops=1)
Index Cond: (fk_clifor = 352)
Total runtime: 215.483 ms
Well, in this case the queries with LEFT OUTER join and with inner join
returns the same result set. I don´t have the sufficient knowledge to
affirm , but I suspect that if the query plan used for fk_clifor = 352
and with left outer join is applied for the first query (fk_clifor = 243
with left outer join)
we will have a better total runtime.
There are some manner to make this test ?
By the way (If this is a stupid idea, ignore this), this same (or a
similar) query plan cannot be used in the queries with inner join since
the difference in times ( 215.483 ms vs 7.209 ms) still significative ?