Re: How to improve the performance of my SQL query？

gzh <gzhcoder@xxxxxxx> · Mon, 24 Jul 2023 17:54:19 +0800 (CST)



>Did you change any parameters that have an impact on query planning?
>You can see that in the output of EXPLAIN (ANALYZE, BUFFERS, SETTINGS).
I added some parameters and re-executed the Execution Plan.
Except for the index not taking effect, I still don't know the reason why the index is not working. 
Is it because there is too much data that meets the conditions?

EXPLAIN (ANALYZE ON, VERBOSE ON, COSTS ON, BUFFERS ON) 
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = 'MLD009'
and TBL_SHA.ETRYS in
   (select TBL_INF.RY_CD
    from TBL_INF
    WHERE TBL_INF.MS_CD = 'MLD009'
   AND TBL_INF.RY_CD = '00000001'
   )

----- Execution Plan -----
Limit  (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.521..128717.677 rows=1 loops=1)
  Output: (count(tbl_sha.et_cd))
  Buffers: shared hit=58948 read=2112758
  I/O Timings: read=357249.120
  ->  Aggregate  (cost=2728633.22..2728633.23 rows=1 width=8) (actual time=128691.519..128717.674 rows=1 loops=1)
        Output: count(tbl_sha.et_cd)
        Buffers: shared hit=58948 read=2112758
        I/O Timings: read=357249.120
        ->  Nested Loop  (cost=1000.29..2722556.76 rows=2430587 width=9) (actual time=2.364..128350.279 rows=2613500 loops=1)
              Output: tbl_sha.et_cd
              Buffers: shared hit=58948 read=2112758
              I/O Timings: read=357249.120
              ->  Index Only Scan using tbl_inf_pkc on mtpdb.tbl_inf  (cost=0.29..8.31 rows=1 width=9) (actual time=0.046..0.051 rows=1 loops=1)
                    Output: tbl_inf.ms_cd, tbl_inf.ry_cd
                    Index Cond: ((tbl_inf.ms_cd = 'MLD009'::bpchar) AND (tbl_inf.ry_cd = '00000001'::bpchar))
                    Heap Fetches: 1
                    Buffers: shared hit=4
              ->  Gather  (cost=1000.00..2698242.58 rows=2430587 width=18) (actual time=2.315..127773.087 rows=2613500 loops=1)
                    Output: tbl_sha.et_cd, tbl_sha.etrys
                    Workers Planned: 2
                    Workers Launched: 2
                    Buffers: shared hit=58944 read=2112758
                    I/O Timings: read=357249.120
                    ->  Parallel Seq Scan on mtpdb.tbl_sha  (cost=0.00..2454183.88 rows=1012745 width=18) (actual time=952.728..127583.089 rows=871167 loops=3)
                          Output: tbl_sha.et_cd, tbl_sha.etrys
                          Filter: ((tbl_sha.ms_cd = 'MLD009'::bpchar) AND (tbl_sha.etrys = '00000001'::bpchar))
                          Rows Removed by Filter: 14062278
                          Buffers: shared hit=58944 read=2112758
                          I/O Timings: read=357249.120
                          Worker 0:  actual time=1432.292..127762.181 rows=988036 loops=1
                            Buffers: shared hit=17875 read=706862
                            I/O Timings: read=119193.744
                          Worker 1:  actual time=1425.878..127786.777 rows=992381 loops=1
                            Buffers: shared hit=19813 read=706359
                            I/O Timings: read=119386.899
Planning:
  Buffers: shared hit=42
Planning Time: 1.024 ms
Execution Time: 128717.731 ms


At 2023-07-24 13:43:46, "Laurenz Albe" <laurenz.albe@xxxxxxxxxxx> wrote:
>On Fri, 2023-07-21 at 09:43 +0800, gzh wrote:
>> The definitions of the columns used in SQL are as follows.
>> 
>> TBL_SHA
>> 
>> ms_cd character(6) NOT NULL       -- PRIMARY KEY
>> et_cd character(8)
>> etrys character(8)
>> 
>> TBL_INF
>> 
>> ms_cd character(6) NOT NULL       -- PRIMARY KEY
>> ry_cd character(8) NOT NULL       -- PRIMARY KEY
>> 
>> I made some modifications to the data, and I realized that I should not change the length of the data. 
>> The actual data and its corresponding execution plan are shown below.
>> 
>> explain analyze
>> select COUNT(ET_CD)
>> from TBL_SHA
>> WHERE TBL_SHA.MS_CD = 'MLD009'
>> and TBL_SHA.ETRYS in
>>    (select TBL_INF.RY_CD
>>     from TBL_INF
>>     WHERE TBL_INF.MS_CD = 'MLD009'
>>    AND TBL_INF.RY_CD = '00000001'
>>    )
>> ----- Execution Plan -----
>> Limit  (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.769..124168.771 rows=1 loops=1)
>>   ->  Aggregate  (cost=2738709.57..2738709.58 rows=1 width=8) (actual time=124168.767..124168.769 rows=1 loops=1)
>>         ->  Nested Loop  (cost=0.29..2730702.63 rows=3202774 width=9) (actual time=97264.166..123920.769 rows=3200000 loops=1)
>>               ->  Index Only Scan using TBL_INF_pkc on TBL_INF  (cost=0.29..8.31 rows=1 width=9) (actual time=0.025..0.030 rows=1 loops=1)
>>                     Index Cond: ((MS_CD = 'MLD009'::bpchar) AND (RY_CD = '00000001'::bpchar))
>>                     Heap Fetches: 1
>>               ->  Seq Scan on TBL_SHA  (cost=0.00..2698666.58 rows=3202774 width=18) (actual time=97264.138..123554.792 rows=3200000 loops=1)
>>                     Filter: ((MS_CD = 'MLD009'::bpchar) AND (ETRYS = '00000001'::bpchar))
>>                     Rows Removed by Filter: 32000325
>> Planning Time: 0.162 ms
>> Execution Time: 124168.838 ms
>
>Thanks.  That should definitely use a b-tree index defined on (ms_cd, etrsys).
>
>Did you change any parameters that have an impact on query planning?
>You can see that in the output of EXPLAIN (ANALYZE, BUFFERS, SETTINGS).
>
>Yours,
>Laurenz Albe