Em 22/11/2019 14:55, Pavel Stehule
escreveu:
Hey,
I'm trying to figure out why Postgres is choosing a
Hash Join over a Nested Loop in this query:
SELECT T1.PesID, T1.PesNom, T1.PesValSal,
T1.PesCPFCNPJ, T2.CarAti, T1.CarCod, T1.EmpCod,
T2.CarFan, T1.PesDatAge, T1.PesCod, COALESCE(
T3.PesDatAnt, DATE '00010101') AS PesDatAnt
FROM ((public.Pessoa T1
INNER JOIN public.Carteira T2 ON
T2.EmpCod = T1.EmpCod AND T2.CarCod = T1.CarCod)
LEFT JOIN (SELECT MIN(COALESCE(
T5.ConVenAnt, DATE '00010101')) AS PesDatAnt,
T4.EmpCod, T4.CarCod, T4.ConPesCod AS ConPesCod
FROM
(public.Contrato T4
LEFT JOIN
(SELECT MIN(ConParDatVen) AS ConVenAnt, EmpCod,
CarCod, ConPesCod, ConSeq
FROM public.ContratoParcela T5
WHERE ConParAti = true
AND ConParValSal > 0
GROUP BY EmpCod, CarCod, ConPesCod, ConSeq )
T5 ON T5.EmpCod = T4.EmpCod AND
T5.CarCod = T4.CarCod AND
T5.ConPesCod = T4.ConPesCod AND
T5.ConSeq = T4.ConSeq)
WHERE T4.ConAti =
TRUE
GROUP BY
T4.EmpCod, T4.CarCod, T4.ConPesCod ) T3 ON
t3.EmpCod = T1.EmpCod AND
t3.CarCod = T1.CarCod
AND
t3.ConPesCod = T1.PesCod)
WHERE (T2.CarAti = true)
AND (T1.EmpCod = 112)
and (UPPER(T1.PesNom) like
UPPER('%MARIA%'))
ORDER BY T1.EmpCod, T1.CarCod, T1.PesCod
Here the Hash Join[1] plan takes ~700ms, and if I
change the first LEFT JOIN to a LEFT JOIN LATERAL,
forcing a nested loop, the query[2] runs in 3ms.
[1] https://explain.depesz.com/s/8IL3
[2] https://explain.depesz.com/s/f8Q9
Maybe I am wrong, but probably you have to do more than
just change LEFT JOIN to LATERAL JOIN. Lateral join is based
on correlated subquery - so you had to push some predicates
to subquery - and then the query can be much more effective.
Regards
Pavel
PostgreSQL version is 11.5, I have run analyze on all the
tables.
PG settings:
name |setting |unit|
-------------------------------|---------|----|
autovacuum |on | |
default_statistics_target |250 | |
effective_cache_size |983040 |8kB |
effective_io_concurrency |200 | |
max_parallel_workers |6 | |
max_parallel_workers_per_gather|3 | |
random_page_cost |1.1 | |
work_mem |51200 |kB |
I'm sorry, I am not sure I understood.
This is the altered query:
SELECT T1.PesID, T1.PesNom, T1.PesValSal, T1.PesCPFCNPJ,
T2.CarAti, T1.CarCod, T1.EmpCod, T2.CarFan, T1.PesDatAge,
T1.PesCod,
COALESCE( T3.PesDatAnt, DATE '00010101') AS
PesDatAnt
FROM ((public.Pessoa T1
INNER JOIN public.Carteira T2 ON T2.EmpCod =
T1.EmpCod AND T2.CarCod = T1.CarCod)
LEFT JOIN LATERAL (SELECT
MIN(COALESCE( T5.ConVenAnt, DATE '00010101')) AS PesDatAnt,
T4.EmpCod, T4.CarCod, T4.ConPesCod AS ConPesCod
FROM (public.Contrato T4
LEFT JOIN (SELECT
MIN(ConParDatVen) AS ConVenAnt, EmpCod, CarCod, ConPesCod, ConSeq
FROM
public.ContratoParcela T5
WHERE
ConParAti = true
and
ConParValSal > 0
GROUP
BY EmpCod, CarCod, ConPesCod, ConSeq ) T5 ON T5.EmpCod =
T4.EmpCod AND T5.CarCod = T4.CarCod AND T5.ConPesCod =
T4.ConPesCod AND T5.ConSeq = T4.ConSeq)
WHERE T4.ConAti = TRUE
AND t4.EmpCod =
T1.EmpCod AND t4.CarCod = T1.CarCod AND t4.ConPesCod = T1.PesCod
GROUP BY T4.EmpCod,
T4.CarCod, T4.ConPesCod ) T3 ON TRUE ) --ON t3.EmpCod =
T1.EmpCod AND t3.CarCod = T1.CarCod AND t3.ConPesCod =
T1.PesCod)
WHERE (T2.CarAti = true)
AND (T1.EmpCod = 112)
and (UPPER(T1.PesNom) like UPPER('%MARIA%'))
ORDER BY T1.EmpCod, T1.CarCod, T1.PesCod
In bold are the changes I've made to the query. I am sure
PostgreSQL is able to push it down, since it is much faster now.
The problem I have is that this is a query generated by an ORM,
So I can't change it.
I would like to understand why wasn't Postgres able to optimize
it to a nested loop. Is there something I can do with the
statistics?
Thanks!!
|