Hello All,
We have the query below, which runs for ~45 seconds on an Aurora PostgreSQL reader instance. I have captured the EXPLAIN (ANALYZE) output. I want to understand where exactly the time and resources are being spent and whether the query can be optimized further. It is a UI query that shows the top 50 rows and is supposed to finish in under a second, but it takes ~45 seconds.
I also see multiple workers in the plan. Does that mean the query is running in parallel?
explain (analyze,verbose,costs,buffers) select TAB1.PRI from SCHEMA1.TAB1 TAB1
inner join SCHEMA1.TAB4 TAB4 on TAB4.PRI = TAB1.PRI
inner join SCHEMA1."TAB2" TAB2 on TAB2.PRI = TAB1.PRI
inner join SCHEMA1.TAB3 a2 on a2.AID = TAB2.AID
where TAB2.MID = XXXXX and TAB4.TAB4_code = 'XX'
and TAB2.TAB2_time between '2024-01-01' and '2024-01-31'
order by TAB2.TAB2_time desc
limit 50;
Limit (cost=13052924.01..13052924.14 rows=50 width=45) (actual time=45211.971..45224.720 rows=50 loops=1)
Output: TAB1.PRI, TAB2.TAB2_time
Buffers: shared hit=1980943 read=2335820
I/O Timings: shared/local read=112477.014
-> Sort (cost=13052924.01..13052924.19 rows=70 width=45) (actual time=45211.969..45224.713 rows=50 loops=1)
Output: TAB1.PRI, TAB2.TAB2_time
Sort Key: TAB2.TAB2_time DESC
Sort Method: top-N heapsort Memory: 32kB
Buffers: shared hit=1980943 read=2335820
I/O Timings: shared/local read=112477.014
-> Gather (cost=92917.38..13052921.87 rows=70 width=45) (actual time=947.004..45221.915 rows=5428 loops=1)
Output: TAB1.PRI, TAB2.TAB2_time
Workers Planned: 4
Workers Launched: 4
Buffers: shared hit=1980943 read=2335820
I/O Timings: shared/local read=112477.014
-> Nested Loop (cost=91917.38..13051914.87 rows=18 width=45) (actual time=945.946..45195.224 rows=1086 loops=5)
Output: TAB1.PRI, TAB2.TAB2_time
Inner Unique: true
Buffers: shared hit=1980943 read=2335820
I/O Timings: shared/local read=112477.014
Worker 0: actual time=936.808..45193.518 rows=1036 loops=1
Buffers: shared hit=382606 read=465076
I/O Timings: shared/local read=22452.028
Worker 1: actual time=947.246..45194.168 rows=1055 loops=1
Buffers: shared hit=383165 read=484189
I/O Timings: shared/local read=22617.135
Worker 2: actual time=933.623..45192.534 rows=1145 loops=1
Buffers: shared hit=415758 read=473182
I/O Timings: shared/local read=22741.488
Worker 3: actual time=965.639..45193.603 rows=1078 loops=1
Buffers: shared hit=398009 read=449053
I/O Timings: shared/local read=22221.094
-> Nested Loop (cost=91916.81..13051828.80 rows=18 width=81) (actual time=945.917..43729.931 rows=1086 loops=5)
Output: TAB1.PRI, TAB2.TAB2_time, TAB2.AID
Inner Unique: true
Join Filter: ((TAB4.PRI)::text = (TAB1.PRI)::text)
Buffers: shared hit=1962289 read=2328363
I/O Timings: shared/local read=105246.220
Worker 0: actual time=936.781..43732.652 rows=1036 loops=1
Buffers: shared hit=379077 read=463587
I/O Timings: shared/local read=21008.508
Worker 1: actual time=947.212..43699.507 rows=1055 loops=1
Buffers: shared hit=379573 read=482704
I/O Timings: shared/local read=21142.572
Worker 2: actual time=933.589..43696.710 rows=1145 loops=1
Buffers: shared hit=411836 read=471634
I/O Timings: shared/local read=21266.581
Worker 3: actual time=965.608..43768.535 rows=1078 loops=1
Buffers: shared hit=394288 read=447583
I/O Timings: shared/local read=20814.288
-> Parallel Hash Join (cost=91916.24..13051765.39 rows=18 width=117) (actual time=945.879..42758.939 rows=1086 loops=5)
Output: TAB4.PRI, TAB2.TAB2_time, TAB2.PRI, TAB2.AID
Hash Cond: ((TAB2.PRI)::text = (TAB4.PRI)::text)
Buffers: shared hit=1943792 read=2322814
I/O Timings: shared/local read=100496.787
Worker 0: actual time=936.743..42798.247 rows=1036 loops=1
Buffers: shared hit=375573 read=462501
I/O Timings: shared/local read=20094.654
Worker 1: actual time=947.169..42752.987 rows=1055 loops=1
Buffers: shared hit=375975 read=481619
I/O Timings: shared/local read=20216.926
Worker 2: actual time=933.545..42660.854 rows=1145 loops=1
Buffers: shared hit=407956 read=470465
I/O Timings: shared/local read=20252.386
Worker 3: actual time=965.567..42797.288 rows=1078 loops=1
Buffers: shared hit=390609 read=446481
I/O Timings: shared/local read=19863.965
-> Parallel Bitmap Heap Scan on SCHEMA1."TAB2" TAB2 (cost=84860.50..13040301.00 rows=1175611 width=80) (actual time=713.054..26942.082 rows=956249 loops=5)
Output: TAB2.TAB2_time, TAB2.PRI, TAB2.AID
Recheck Cond: (TAB2.MID = 'XXXXX'::numeric)
Rows Removed by Index Recheck: 2137395
Filter: ((TAB2.TAB2_time >= '2024-01-01 00:00:00+00'::timestamp with time zone) AND (TAB2.TAB2_time <= '2024-01-31 00:00:00+00'::timestamp with time zone))
Heap Blocks: exact=5300 lossy=782577
Buffers: shared hit=1651569 read=2245157
I/O Timings: shared/local read=29063.286
Worker 0: actual time=713.040..27006.980 rows=942051 loops=1
Buffers: shared hit=317611 read=447013
I/O Timings: shared/local read=5851.688
Worker 1: actual time=713.047..27065.878 rows=939696 loops=1
Buffers: shared hit=317632 read=466176
I/O Timings: shared/local read=6038.851
Worker 2: actual time=713.027..26894.506 rows=967468 loops=1
Buffers: shared hit=349596 read=454912
I/O Timings: shared/local read=5962.348
Worker 3: actual time=713.091..26826.767 rows=961928 loops=1
Buffers: shared hit=332980 read=430848
I/O Timings: shared/local read=5426.475
-> Bitmap Index Scan on TAB2_idx2 (cost=0.00..83684.89 rows=4702443 width=0) (actual time=688.661..688.661 rows=4781245 loops=1)
Index Cond: (TAB2.MID = 'XXXXX'::numeric)
Buffers: shared hit=12408
Worker 2: actual time=688.661..688.661 rows=4781245 loops=1
Buffers: shared hit=12408
-> Parallel Hash (cost=7042.63..7042.63 rows=1049 width=37) (actual time=217.987..217.988 rows=27613 loops=5)
Output: TAB4.PRI
Buckets: 262144 (originally 2048) Batches: 1 (originally 1) Memory Usage: 13936kB
Buffers: shared hit=134917
Worker 0: actual time=214.981..214.982 rows=27779 loops=1
Buffers: shared hit=27133
Worker 1: actual time=215.455..215.456 rows=27805 loops=1
Buffers: shared hit=27159
Worker 2: actual time=215.774..215.774 rows=27330 loops=1
Buffers: shared hit=26706
Worker 3: actual time=215.776..215.777 rows=26880 loops=1
Buffers: shared hit=26245
-> Parallel Bitmap Heap Scan on SCHEMA1.TAB4 TAB4 (cost=26.39..7042.63 rows=1049 width=37) (actual time=23.650..201.606 rows=27613 loops=5)
Output: TAB4.PRI
Recheck Cond: ((TAB4.TAB4_code)::text = 'XX'::text)
Rows Removed by Index Recheck: 616610
Heap Blocks: exact=11978 lossy=15624
Buffers: shared hit=134917
Worker 0: actual time=20.627..199.852 rows=27779 loops=1
Buffers: shared hit=27133
Worker 1: actual time=21.065..199.786 rows=27805 loops=1
Buffers: shared hit=27159
Worker 2: actual time=21.445..198.582 rows=27330 loops=1
Buffers: shared hit=26706
Worker 3: actual time=21.470..195.915 rows=26880 loops=1
Buffers: shared hit=26245
-> Bitmap Index Scan on TAB4_idx1 (cost=0.00..25.95 rows=1784 width=0) (actual time=23.938..23.938 rows=138067 loops=1)
Index Cond: ((TAB4.TAB4_code)::text = 'XX'::text)
Buffers: shared hit=72
-> Index Only Scan using TAB1_pk on SCHEMA1.TAB1 TAB1 (cost=0.57..3.51 rows=1 width=37) (actual time=0.891..0.891 rows=1 loops=5428)
Output: TAB1.PRI
Index Cond: (TAB1.PRI = (TAB2.PRI)::text)
Heap Fetches: 0
Buffers: shared hit=18262 read=5549
I/O Timings: shared/local read=4749.434
Worker 0: actual time=0.899..0.899 rows=1 loops=1036
Buffers: shared hit=3464 read=1086
I/O Timings: shared/local read=913.854
Worker 1: actual time=0.894..0.894 rows=1 loops=1055
Buffers: shared hit=3558 read=1085
I/O Timings: shared/local read=925.646
Worker 2: actual time=0.901..0.901 rows=1 loops=1145
Buffers: shared hit=3840 read=1169
I/O Timings: shared/local read=1014.196
Worker 3: actual time=0.898..0.898 rows=1 loops=1078
Buffers: shared hit=3634 read=1102
I/O Timings: shared/local read=950.323
-> Index Only Scan using TAB3_pk on SCHEMA1.TAB3 a2 (cost=0.57..4.78 rows=1 width=36) (actual time=1.336..1.336 rows=1 loops=5428)
Output: a2.AID
Index Cond: (a2.AID = (TAB2.AID)::text)
Heap Fetches: 1836
Buffers: shared hit=18278 read=7398
I/O Timings: shared/local read=7172.664
Worker 0: actual time=1.393..1.393 rows=1 loops=1036
Buffers: shared hit=3455 read=1473
I/O Timings: shared/local read=1429.250
Worker 1: actual time=1.405..1.405 rows=1 loops=1055
Buffers: shared hit=3531 read=1476
I/O Timings: shared/local read=1464.637
Worker 2: actual time=1.296..1.296 rows=1 loops=1145
Buffers: shared hit=3857 read=1538
I/O Timings: shared/local read=1465.583
Worker 3: actual time=1.309..1.309 rows=1 loops=1078
Buffers: shared hit=3642 read=1459
I/O Timings: shared/local read=1395.946
Query Identifier: 7231829541130579109
Planning:
Buffers: shared hit=1414
Planning Time: 1.305 ms
Execution Time: 45224.792 ms
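To help frame the "where are the resources spent" question, here is a rough tally of a few counters copied from the plan above. It is only a back-of-the-envelope sketch: the 8 kB block size is an assumption (the PostgreSQL default, not checked on this instance), the variable names are made up for illustration, and the Heap Blocks figures are taken as reported on the TAB2 scan node without further interpretation.

```python
# Back-of-the-envelope tally of counters copied from the EXPLAIN output above.
# Assumption: 8 kB block size (the PostgreSQL default); not confirmed here.
BLOCK_SIZE_BYTES = 8192

# Top-level Limit node: "Buffers: shared hit=1980943 read=2335820"
total_hit, total_read = 1_980_943, 2_335_820

# Parallel Bitmap Heap Scan on TAB2: "Buffers: shared hit=1651569 read=2245157"
tab2_hit, tab2_read = 1_651_569, 2_245_157

# Same node: "Heap Blocks: exact=5300 lossy=782577"
exact_blocks, lossy_blocks = 5_300, 782_577

# Pages touched overall and by the TAB2 scan alone.
total_pages = total_hit + total_read
tab2_pages = tab2_hit + tab2_read

# Derived figures: data volume, TAB2's share of page accesses,
# and the share of bitmap heap blocks that were lossy.
total_gib = total_pages * BLOCK_SIZE_BYTES / 2**30
tab2_share_pct = 100 * tab2_pages / total_pages
lossy_pct = 100 * lossy_blocks / (exact_blocks + lossy_blocks)

print(f"pages touched by the whole query: {total_pages} (~{total_gib:.1f} GiB)")
print(f"share of those pages under the TAB2 scan: {tab2_share_pct:.1f}%")
print(f"lossy share of TAB2 heap blocks: {lossy_pct:.1f}%")
```

If I am reading these counters correctly, most of the page accesses sit under the bitmap heap scan on TAB2, and nearly all of its reported heap blocks are lossy, which would match the large "Rows Removed by Index Recheck" count on that node.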