Re: HIGH IO and Less CPU utilization

Rambabu g <rambabu.g2564@xxxxxxxxx> · Mon, 4 Apr 2022 13:17:59 +0530

Hi Justin,

I executed the same query first time it's takes 6+ sec, but if I run again same query that is taking 34 mill seconds, it's seems shared buffer reads are taking, but the second time shared buffer reads are not showing us,  so Please suggest me ig I need to change any  parameters to tune here.

postgres=# explain (analyze,buffers)   SELECT * FROM emp WHERE (empno='C3916271986');
                                                                  QUERY PLAN                                                                   
-----------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using pd_activity_empi on pd_activity  (cost=0.57..23391.90 rows=7956 width=9202) (actual time=4.346..6442.761 rows=12771 loops=1)
   Index Cond: ((empno)::text = 'C3916271986'::text)
   Buffers: shared hit=598 read=12224
 Planning Time: 0.130 ms
 Execution Time: 6446.664 ms
(5 rows)

postgres=# explain (analyze,buffers)    SELECT * FROM emp WHERE (empno='C3916271986');
                                                                 QUERY PLAN                                                                  
---------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using pd_activity_empi on pd_activity  (cost=0.57..23391.90 rows=7956 width=9202) (actual time=0.027..33.921 rows=12771 loops=1)
   Index Cond: ((empi)::text = 'C3916271986'::text)
   Buffers: shared hit=12822
 Planning Time: 0.138 ms
 Execution Time: 34.344 ms
(5 rows)

empno Changed :

postgres=# explain (analyze,buffers)  SELECT * FROM emp WHERE (empno='C6853372011');
                                                                 QUERY PLAN                                                                 
--------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using pd_activity_empi on pd_activity  (cost=0.57..23391.90 rows=7956 width=9202) (actual time=2.764..430.357 rows=758 loops=1)
   Index Cond: ((empi)::text = 'C6853372011'::text)
   Buffers: shared hit=46 read=718
 Planning Time: 0.136 ms
 Execution Time: 430.617 ms
(5 rows)

Regards,
Rambabu.

On Thu, 31 Mar 2022 at 12:19, Justin Pryzby <pryzby@xxxxxxxxxxxxx> wrote:
On Wed, Mar 30, 2022 at 10:17:38AM +0530, Rambabu g wrote:

> Hi Justin,

> 

> Only one query is causing the issue, sharing the def of indexes. Please

> have a look.

> 

> > > There are three indexes defined on the table, each one is around 20 to 25GB

> 

>  tp    | character varying(2000)   | yes    | tp       | extended |

> 

>    852 | 00:09:56.131136 | IO              | DataFileRead  | explain

> analyze select distinct  empno  from emp where sname='test' and tp='EMP

> NAME 1'

The server is doing a scan of the large table.

The tp index matches a lot of rows (13e6) which probably aren't clustered, so

it elects to scan the 500GB table each time.

Looking at this in isolation, maybe it'd be enough to create an index on

tp,empno (and maybe drop the tp index).  CREATE INDEX CONCURRENTLY if you don't

want to disrupt other queries.

But This seems like something that should be solved in a better way though ;

like keeping a table with all the necessary "empno" maintained with "INSERT ON

CONFLICT DO NOTHING".  Or a trigger.