Postgres version: 11.4
Problem:
Query choosing Bad Index Path. Details are provided below
Table :
\d public.distdbentityauditlog1_46625_temp_mahi1;
Table "public.distdbentityauditlog1_46625_temp_mahi1"
Column | Type | Collation | Nullable | Default
------------------+-----------------------------+-----------+----------+---------
zgid | bigint | | not null |
auditlogid | bigint | | not null |
recordid | bigint | | |
recordname | text | | |
module | character varying(50) | | not null |
actioninfo | character varying(255) | | not null |
relatedid | bigint | | |
relatedname | character varying(255) | | |
relatedmodule | character varying(50) | | |
accountid | bigint | | |
accountname | character varying(255) | | |
doneby | character varying(255) | | not null |
userid | bigint | | |
auditedtime | timestamp without time zone | | not null |
fieldhistoryinfo | text | | |
isauditlogdata | boolean | | not null |
otherdetails | text | | |
audittype | integer | | not null |
requesteruserid | bigint | | |
actiontype | integer | | not null |
source | integer | | not null |
module_lower | character varying(50) | | not null |
Indexes:
"distdbentityauditlog1_46625_temp_mahi1_pkey" PRIMARY KEY, btree (zgid, auditedtime, auditlogid)
"distdbentityauditlog1_46625_temp_mahi1_actioninfo_idx" gist (actioninfo gist_trgm_ops)
"distdbentityauditlog1_46625_temp_mahi1_actioninfo_idx1" btree (actioninfo)
"distdbentityauditlog1_46625_temp_mahi1_auditedtime_module_idx" btree (auditedtime DESC, module)
"distdbentityauditlog1_46625_temp_mahi1_auditlogid_idx" btree (auditlogid DESC)
"distdbentityauditlog1_46625_temp_mahi1_idx5" btree (module)
"distdbentityauditlog1_46625_temp_mahi1_idx6" btree (recordid, auditedtime DESC)
"distdbentityauditlog1_46625_temp_mahi1_idx7" btree (relatedid, auditedtime DESC)
explain (analyse, buffers, verbose) SELECT zgid, auditlogid, recordid, recordname, module, actioninfo, relatedid, relatedname, relatedmodule, accountid, accountname, doneby, userid, auditedtime, fieldhistoryinfo, isauditlogdata, otherdetails, audittype, requesteruserid, actiontype, source FROM public.distdbentityauditlog1_46625_temp_mahi1 distdbentityauditlog1 WHERE ((actiontype = ANY ('{2,9,14,55,56,67}'::integer[])) AND ((recordid = '15842006928391817'::bigint) OR ((module)::text = 'Contacts'::text)) AND ((recordid = '15842006928391817'::bigint) OR (relatedid = '15842006928391817'::bigint)) AND (audittype <> ALL ('{2,4,5,6}'::integer[])) AND (auditedtime >= '2021-03-27 09:43:17'::timestamp without time zone) AND (zgid = 100)) ORDER BY 14 DESC, 2 DESC LIMIT '10'::bigint;
Doubt
1. Why is this Query choosing Index Scan Backward using table1_pkey Index though it's cost is high. It can rather choose
BITMAP OR
(Index on RECORDID) i.e; table1_idx6
(Index on RELATEDID) i.e; table1_idx7
Below is the selectivity details from pg_stats table
- Recordid has 51969 distinct values. And selectivity (most_common_freqs) for recordid = 15842006928391817 is 0.00376667
- Relatedid has 82128 distinct values. And selectivity (most_common_freqs) for recordid = 15842006928391817 is 0.0050666
Since, selectivity is less, this should logically choose this Index, which would have improve my query performance here.
I cross-checked the same by removing PrimaryKey to this table and query now chooses these indexes and response is in 100ms. Please refer the plan below (after removing primary key):
alter table public.distdbentityauditlog1_46625_temp_mahi1 drop constraint distdbentityauditlog1_46625_temp_mahi1_pkey;
analyse public.distdbentityauditlog1_46625_temp_mahi1;
explain (analyse, buffers, verbose) SELECT zgid, auditlogid, recordid, recordname, module, actioninfo, relatedid, relatedname, relatedmodule, accountid, accountname, doneby, userid, auditedtime, fieldhistoryinfo, isauditlogdata, otherdetails, audittype, requesteruserid, actiontype, source FROM public.distdbentityauditlog1_46625_temp_mahi1 distdbentityauditlog1 WHERE ((actiontype = ANY ('{2,9,14,55,56,67}'::integer[])) AND ((recordid = '15842006928391817'::bigint) OR ((module)::text = 'Contacts'::text)) AND ((recordid = '15842006928391817'::bigint) OR (relatedid = '15842006928391817'::bigint)) AND (audittype <> ALL ('{2,4,5,6}'::integer[])) AND (auditedtime >= '2021-03-27 09:43:17'::timestamp without time zone) AND (zgid = 100)) ORDER BY 14 DESC, 2 DESC LIMIT '10'::bigint;
Problem:
Query choosing Bad Index Path. Details are provided below
Table :
\d public.distdbentityauditlog1_46625_temp_mahi1;
Table "public.distdbentityauditlog1_46625_temp_mahi1"
Column | Type | Collation | Nullable | Default
------------------+-----------------------------+-----------+----------+---------
zgid | bigint | | not null |
auditlogid | bigint | | not null |
recordid | bigint | | |
recordname | text | | |
module | character varying(50) | | not null |
actioninfo | character varying(255) | | not null |
relatedid | bigint | | |
relatedname | character varying(255) | | |
relatedmodule | character varying(50) | | |
accountid | bigint | | |
accountname | character varying(255) | | |
doneby | character varying(255) | | not null |
userid | bigint | | |
auditedtime | timestamp without time zone | | not null |
fieldhistoryinfo | text | | |
isauditlogdata | boolean | | not null |
otherdetails | text | | |
audittype | integer | | not null |
requesteruserid | bigint | | |
actiontype | integer | | not null |
source | integer | | not null |
module_lower | character varying(50) | | not null |
Indexes:
"distdbentityauditlog1_46625_temp_mahi1_pkey" PRIMARY KEY, btree (zgid, auditedtime, auditlogid)
"distdbentityauditlog1_46625_temp_mahi1_actioninfo_idx" gist (actioninfo gist_trgm_ops)
"distdbentityauditlog1_46625_temp_mahi1_actioninfo_idx1" btree (actioninfo)
"distdbentityauditlog1_46625_temp_mahi1_auditedtime_module_idx" btree (auditedtime DESC, module)
"distdbentityauditlog1_46625_temp_mahi1_auditlogid_idx" btree (auditlogid DESC)
"distdbentityauditlog1_46625_temp_mahi1_idx5" btree (module)
"distdbentityauditlog1_46625_temp_mahi1_idx6" btree (recordid, auditedtime DESC)
"distdbentityauditlog1_46625_temp_mahi1_idx7" btree (relatedid, auditedtime DESC)
explain (analyse, buffers, verbose) SELECT zgid, auditlogid, recordid, recordname, module, actioninfo, relatedid, relatedname, relatedmodule, accountid, accountname, doneby, userid, auditedtime, fieldhistoryinfo, isauditlogdata, otherdetails, audittype, requesteruserid, actiontype, source FROM public.distdbentityauditlog1_46625_temp_mahi1 distdbentityauditlog1 WHERE ((actiontype = ANY ('{2,9,14,55,56,67}'::integer[])) AND ((recordid = '15842006928391817'::bigint) OR ((module)::text = 'Contacts'::text)) AND ((recordid = '15842006928391817'::bigint) OR (relatedid = '15842006928391817'::bigint)) AND (audittype <> ALL ('{2,4,5,6}'::integer[])) AND (auditedtime >= '2021-03-27 09:43:17'::timestamp without time zone) AND (zgid = 100)) ORDER BY 14 DESC, 2 DESC LIMIT '10'::bigint;
Limit (cost=0.43..438.62 rows=10 width=400) (actual time=8045.030..8045.576 rows=10 loops=1)
Output: zgid, auditlogid, recordid, recordname, module, actioninfo, relatedid, relatedname, relatedmodule, accountid, accountname, doneby, userid, auditedtime, fieldhistoryinfo, isauditlogdata, otherdetails, audittype, requesteru
serid, actiontype, source
Buffers: shared hit=548660 read=1485553
-> Index Scan Backward using distdbentityauditlog1_46625_temp_mahi1_pkey on public.distdbentityauditlog1_46625_temp_mahi1 distdbentityauditlog1 (cost=0.43..445948.91 rows=10177 width=400) (actual time=8045.027..8045.569 rows=10
loops=1)
Output: zgid, auditlogid, recordid, recordname, module, actioninfo, relatedid, relatedname, relatedmodule, accountid, accountname, doneby, userid, auditedtime, fieldhistoryinfo, isauditlogdata, otherdetails, audittype, requ
esteruserid, actiontype, source
Index Cond: ((distdbentityauditlog1.zgid = 100) AND (distdbentityauditlog1.auditedtime >= '2021-03-27 09:43:17'::timestamp without time zone))
Filter: (((distdbentityauditlog1.recordid = '15842006928391817'::bigint) OR ((distdbentityauditlog1.module)::text = 'Contacts'::text)) AND ((distdbentityauditlog1.recordid = '15842006928391817'::bigint) OR (distdbentityaudi
tlog1.relatedid = '15842006928391817'::bigint)) AND (distdbentityauditlog1.audittype <> ALL ('{2,4,5,6}'::integer[])) AND (distdbentityauditlog1.actiontype = ANY ('{2,9,14,55,56,67}'::integer[])))
Rows Removed by Filter: 2943989
Buffers: shared hit=548660 read=1485553
Planning Time: 0.530 ms
Execution Time: 8045.687 ms
Doubt
1. Why is this Query choosing Index Scan Backward using table1_pkey Index though it's cost is high. It can rather choose
BITMAP OR
(Index on RECORDID) i.e; table1_idx6
(Index on RELATEDID) i.e; table1_idx7
Below is the selectivity details from pg_stats table
- Recordid has 51969 distinct values. And selectivity (most_common_freqs) for recordid = 15842006928391817 is 0.00376667
- Relatedid has 82128 distinct values. And selectivity (most_common_freqs) for recordid = 15842006928391817 is 0.0050666
Since, selectivity is less, this should logically choose this Index, which would have improve my query performance here.
I cross-checked the same by removing PrimaryKey to this table and query now chooses these indexes and response is in 100ms. Please refer the plan below (after removing primary key):
alter table public.distdbentityauditlog1_46625_temp_mahi1 drop constraint distdbentityauditlog1_46625_temp_mahi1_pkey;
analyse public.distdbentityauditlog1_46625_temp_mahi1;
explain (analyse, buffers, verbose) SELECT zgid, auditlogid, recordid, recordname, module, actioninfo, relatedid, relatedname, relatedmodule, accountid, accountname, doneby, userid, auditedtime, fieldhistoryinfo, isauditlogdata, otherdetails, audittype, requesteruserid, actiontype, source FROM public.distdbentityauditlog1_46625_temp_mahi1 distdbentityauditlog1 WHERE ((actiontype = ANY ('{2,9,14,55,56,67}'::integer[])) AND ((recordid = '15842006928391817'::bigint) OR ((module)::text = 'Contacts'::text)) AND ((recordid = '15842006928391817'::bigint) OR (relatedid = '15842006928391817'::bigint)) AND (audittype <> ALL ('{2,4,5,6}'::integer[])) AND (auditedtime >= '2021-03-27 09:43:17'::timestamp without time zone) AND (zgid = 100)) ORDER BY 14 DESC, 2 DESC LIMIT '10'::bigint;
Limit (cost=140917.99..140918.01 rows=10 width=402) (actual time=103.667..103.673 rows=10 loops=1)
Output: zgid, auditlogid, recordid, recordname, module, actioninfo, relatedid, relatedname, relatedmodule, accountid, accountname, doneby, userid, auditedtime, fieldhistoryinfo, isauditlogdata, otherdetails, audittype, requesteru
serid, actiontype, source
Buffers: shared read=10448 written=9
-> Sort (cost=140917.99..140942.38 rows=9759 width=402) (actual time=103.665..103.667 rows=10 loops=1)
Output: zgid, auditlogid, recordid, recordname, module, actioninfo, relatedid, relatedname, relatedmodule, accountid, accountname, doneby, userid, auditedtime, fieldhistoryinfo, isauditlogdata, otherdetails, audittype, requ
esteruserid, actiontype, source
Sort Key: distdbentityauditlog1.auditedtime DESC, distdbentityauditlog1.auditlogid DESC
Sort Method: top-N heapsort Memory: 34kB
Buffers: shared read=10448 written=9
-> Bitmap Heap Scan on public.distdbentityauditlog1_46625_temp_mahi1 distdbentityauditlog1 (cost=686.74..140707.10 rows=9759 width=402) (actual time=12.291..79.847 rows=16824 loops=1)
Output: zgid, auditlogid, recordid, recordname, module, actioninfo, relatedid, relatedname, relatedmodule, accountid, accountname, doneby, userid, auditedtime, fieldhistoryinfo, isauditlogdata, otherdetails, audittype
, requesteruserid, actiontype, source
Recheck Cond: (((distdbentityauditlog1.recordid = '15842006928391817'::bigint) AND (distdbentityauditlog1.auditedtime >= '2021-03-27 09:43:17'::timestamp without time zone)) OR ((distdbentityauditlog1.relatedid = '158
42006928391817'::bigint) AND (distdbentityauditlog1.auditedtime >= '2021-03-27 09:43:17'::timestamp without time zone)))
Filter: ((distdbentityauditlog1.zgid = 100) AND ((distdbentityauditlog1.recordid = '15842006928391817'::bigint) OR ((distdbentityauditlog1.module)::text = 'Contacts'::text)) AND (distdbentityauditlog1.audittype <>
ALL ('{2,4,5,6}'::integer[])) AND (distdbentityauditlog1.actiontype = ANY ('{2,9,14,55,56,67}'::integer[])))
Heap Blocks: exact=10267
Buffers: shared read=10448 written=9
-> BitmapOr (cost=686.74..686.74 rows=32499 width=0) (actual time=9.464..9.464 rows=0 loops=1)
Buffers: shared read=181
-> Bitmap Index Scan on distdbentityauditlog1_46625_temp_mahi1_idx6 (cost=0.00..348.93 rows=16250 width=0) (actual time=5.812..5.812 rows=16928 loops=1)
Index Cond: ((distdbentityauditlog1.recordid = '15842006928391817'::bigint) AND (distdbentityauditlog1.auditedtime >= '2021-03-27 09:43:17'::timestamp without time zone))
Buffers: shared read=95
-> Bitmap Index Scan on distdbentityauditlog1_46625_temp_mahi1_idx7 (cost=0.00..332.93 rows=16250 width=0) (actual time=3.650..3.650 rows=16824 loops=1)
Index Cond: ((distdbentityauditlog1.relatedid = '15842006928391817'::bigint) AND (distdbentityauditlog1.auditedtime >= '2021-03-27 09:43:17'::timestamp without time zone))
Buffers: shared read=86
Planning Time: 1.110 ms
Execution Time: 103.755 ms