postgres=# explain (analyze,buffers) SELECT * FROM emp WHERE (empno='C3916271986');
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using pd_activity_empi on pd_activity (cost=0.57..23391.90 rows=7956 width=9202) (actual time=4.346..6442.761 rows=12771 loops=1)
Index Cond: ((empno)::text = 'C3916271986'::text)
Buffers: shared hit=598 read=12224
Planning Time: 0.130 ms
Execution Time: 6446.664 ms
(5 rows)
postgres=# explain (analyze,buffers) SELECT * FROM emp WHERE (empno='C3916271986');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using pd_activity_empi on pd_activity (cost=0.57..23391.90 rows=7956 width=9202) (actual time=0.027..33.921 rows=12771 loops=1)
Index Cond: ((empi)::text = 'C3916271986'::text)
Buffers: shared hit=12822
Planning Time: 0.138 ms
Execution Time: 34.344 ms
(5 rows)
postgres=# explain (analyze,buffers) SELECT * FROM emp WHERE (empno='C6853372011');
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using pd_activity_empi on pd_activity (cost=0.57..23391.90 rows=7956 width=9202) (actual time=2.764..430.357 rows=758 loops=1)
Index Cond: ((empi)::text = 'C6853372011'::text)
Buffers: shared hit=46 read=718
Planning Time: 0.136 ms
Execution Time: 430.617 ms
(5 rows)
On Wed, Mar 30, 2022 at 10:17:38AM +0530, Rambabu g wrote:
> Hi Justin,
>
> Only one query is causing the issue, sharing the def of indexes. Please
> have a look.
>
> > > There are three indexes defined on the table, each one is around 20 to 25GB
>
> tp | character varying(2000) | yes | tp | extended |
>
> 852 | 00:09:56.131136 | IO | DataFileRead | explain
> analyze select distinct empno from emp where sname='test' and tp='EMP
> NAME 1'
The server is doing a scan of the large table.
The tp index matches a lot of rows (13e6) which probably aren't clustered, so
it elects to scan the 500GB table each time.
Looking at this in isolation, maybe it'd be enough to create an index on
tp,empno (and maybe drop the tp index). CREATE INDEX CONCURRENTLY if you don't
want to disrupt other queries.
But This seems like something that should be solved in a better way though ;
like keeping a table with all the necessary "empno" maintained with "INSERT ON
CONFLICT DO NOTHING". Or a trigger.