On Mon, 5 Jun 2023 at 07:56, gzh <gzhcoder@xxxxxxx> wrote:
Hi everyone,
I'm running into some performance issues with my SQL query.The following SQL query is taking a long time to execute.Execution Plan:explain analyseselect * from TBL_RESleft outer join(select T_CUST.RSNO RSNO2 ,T_CUST.KNO ,T_CUST.AGE ,T_CUST.GSTfrom TBL_CUST T_CUST ,(select T_CUST.RSNO ,T_CUST.KNO ,MIN(T_CUST.GSTSEQ) GSTSEQfrom TBL_CUST T_CUST ,TBL_POV T_POV ,TBL_RES T_RESwhere T_CUST.STSFLG = 'T'and T_CUST.DISPSEQ <> 9999AND T_CUST.KFIX = '0'and T_POV.CRSNO = T_RES.CRSNOand T_RES.RSNO = T_CUST.RSNOgroup by T_CUST.RSNO , T_CUST.KNO) T_POV2where T_POV2.RSNO = T_CUST.RSNOand T_POV2.KNO = T_CUST.KNOand T_POV2.GSTSEQ = T_CUST.GSTSEQ) T_POV3 on TBL_RES.RSNO = T_POV3.RSNO2and TBL_RES.KNO = T_POV3.KNOwhere TBL_RES.CID >= to_date('2022/07/01', 'YYYY/MM/DD')and TBL_RES.CID <= to_date('2022/07/31', 'YYYY/MM/DD')and TBL_RES.COD >= to_date('2022/07/01', 'YYYY/MM/DD')and TBL_RES.COD <= to_date('2022/07/31', 'YYYY/MM/DD')----- Execution Plan -----Nested Loop Left Join (cost=254388.44..452544.70 rows=473 width=3545) (actual time=3077.312..996048.714 rows=15123 loops=1)Join Filter: ((TBL_RES.RSNO = T_CUST.RSNO) AND ((TBL_RES.KNO)::text = (T_CUST.KNO)::text))Rows Removed by Join Filter: 4992268642-> Gather (cost=1000.00..58424.35 rows=473 width=3489) (actual time=0.684..14.158 rows=15123 loops=1)Workers Planned: 2Workers Launched: 2-> Parallel Seq Scan on TBL_RES (cost=0.00..57377.05 rows=197 width=3489) (actual time=0.096..279.504 rows=5041 loops=3)Filter: ((CID >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (CID <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)) AND (COD >= to_date('2022/07/01'::text, 'YYYY/MM/DD'::text)) AND (COD <= to_date('2022/07/31'::text, 'YYYY/MM/DD'::text)))Rows Removed by Filter: 161714-> Materialize (cost=253388.44..394112.08 rows=1 width=56) (actual time=0.081..26.426 rows=330111 loops=15123)-> Hash Join (cost=253388.44..394112.07 rows=1 width=56) (actual time=1197.484..2954.084 rows=330111 loops=1)Hash Cond: ((T_CUST.RSNO = T_CUST_1.RSNO) AND ((T_CUST.KNO)::text = (T_CUST_1.KNO)::text) AND (T_CUST.gstseq = (min(T_CUST_1.gstseq))))-> Seq Scan on TBL_CUST T_CUST (cost=0.00..79431.15 rows=2000315 width=61) (actual time=0.015..561.005 rows=2000752 loops=1)-> Hash (cost=246230.90..246230.90 rows=262488 width=50) (actual time=1197.025..1209.957 rows=330111 loops=1)Buckets: 65536 Batches: 8 Memory Usage: 2773kB-> Finalize GroupAggregate (cost=205244.84..243606.02 rows=262488 width=50) (actual time=788.552..1116.074 rows=330111 loops=1)Group Key: T_CUST_1.RSNO, T_CUST_1.KNO-> Gather Merge (cost=205244.84..238964.80 rows=268846 width=50) (actual time=788.547..982.479 rows=330111 loops=1)Workers Planned: 2Workers Launched: 1-> Partial GroupAggregate (cost=204244.81..206933.27 rows=134423 width=50) (actual time=784.032..900.979 rows=165056 loops=2)Group Key: T_CUST_1.RSNO, T_CUST_1.KNO-> Sort (cost=204244.81..204580.87 rows=134423 width=23) (actual time=784.019..833.791 rows=165061 loops=2)Sort Key: T_CUST_1.RSNO, T_CUST_1.KNOSort Method: external merge Disk: 5480kBWorker 0: Sort Method: external merge Disk: 5520kB-> Parallel Hash Join (cost=111758.80..190036.38 rows=134423 width=23) (actual time=645.302..716.247 rows=165061 loops=2)Hash Cond: (T_CUST_1.RSNO = T_RES.RSNO)-> Parallel Seq Scan on TBL_CUST T_CUST_1 (cost=0.00..74013.63 rows=204760 width=23) (actual time=0.018..264.390 rows=165058 loops=2)Filter: ((dispseq <> '9999'::numeric) AND ((stsflg)::text = 'T'::text) AND ((KFIX)::text = '0'::text))Rows Removed by Filter: 835318-> Parallel Hash (cost=109508.52..109508.52 rows=137142 width=8) (actual time=343.593..343.896 rows=165058 loops=2)Buckets: 131072 Batches: 8 Memory Usage: 3008kB-> Parallel Hash Join (cost=51834.70..109508.52 rows=137142 width=8) (actual time=256.732..314.368 rows=165058 loops=2)Hash Cond: ((T_RES.crsno)::text = (T_POV.crsno)::text)-> Parallel Seq Scan on TBL_RES T_RES (cost=0.00..53199.02 rows=208902 width=17) (actual time=0.007..100.510 rows=250132 loops=2)-> Parallel Hash (cost=49450.42..49450.42 rows=137142 width=9) (actual time=122.308..122.309 rows=165054 loops=2)Buckets: 131072 Batches: 8 Memory Usage: 2976kB-> Parallel Seq Scan on TBL_POV T_POV (cost=0.00..49450.42 rows=137142 width=9) (actual time=0.037..89.470 rows=165054 loops=2)Planning Time: 1.064 msExecution Time: 996062.382 ms--------------------------------------------------------------------------------The amount of data in the table is as follows.TBL_RES 500265TBL_CUST 2000752TBL_POV 330109Any suggestions for improving the performance of the query would be greatly appreciated.Thanks in advance!
Welcome to query optimisation, I hope you'll enjoy working on this problem, it's one of the things I and many others love.
From a quick glance, it looks like there are opportunities for index use there but the plan doesn't contain any. Do any indexes exist?
You could also start looking at it in parts, e.g. separate out the inner T_POV2 subselect. It looks like that contains a cross join, which is then whittled down with WHERE clauses. You could try adding ON to the FROM instead. https://www.postgresql.org/docs/15/sql-select.html#SQL-FROM
Oliver