Hi,
I have a query performance issue, it takes a long time, and not even getting explain analyze the output. this query joining on 3 tables which have around
a - 176223509
b - 286887780
c - 214219514
explain
select Count(a."individual_entity_proxy_id")
from "prospect" a
inner join "individual_demographic" b
on a."individual_entity_proxy_id" = b."individual_entity_proxy_id"
inner join "household_demographic" c
on a."household_entity_proxy_id" = c."household_entity_proxy_id"
where (((a."last_contacted_anychannel_dttm" is null)
or (a."last_contacted_anychannel_dttm" < TIMESTAMP '2020-11-23 0:00:00.000000'))
and (a."shared_paddr_with_customer_ind" = 'N')
and (a."profane_wrd_ind" = 'N')
and (a."tmo_ofnsv_name_ind" = 'N')
and (a."has_individual_address" = 'Y')
and (a."has_last_name" = 'Y')
and (a."has_first_name" = 'Y'))
and ((b."tax_bnkrpt_dcsd_ind" = 'N')
and (b."govt_prison_ind" = 'N')
and (b."cstmr_prspct_ind" = 'Prospect'))
and (( c."hspnc_lang_prfrnc_cval" in ('B', 'E', 'X') )
or (c."hspnc_lang_prfrnc_cval" is null));
-- Explain output
"Finalize Aggregate (cost=32813309.28..32813309.29 rows=1 width=8)"
" -> Gather (cost=32813308.45..32813309.26 rows=8 width=8)"
" Workers Planned: 8"
" -> Partial Aggregate (cost=32812308.45..32812308.46 rows=1 width=8)"
" -> Merge Join (cost=23870130.00..32759932.46 rows=20950395 width=8)"
" Merge Cond: (a.individual_entity_proxy_id = b.individual_entity_proxy_id)"
" -> Sort (cost=23870127.96..23922503.94 rows=20950395 width=8)"
" Sort Key: a.individual_entity_proxy_id"
" -> Hash Join (cost=13533600.42..21322510.26 rows=20950395 width=8)"
" Hash Cond: (a.household_entity_proxy_id = c.household_entity_proxy_id)"
" -> Parallel Seq Scan on prospect a (cost=0.00..6863735.60 rows=22171902 width=16)"
" Filter: (((last_contacted_anychannel_dttm IS NULL) OR (last_contacted_anychannel_dttm < '2020-11-23 00:00:00'::timestamp without time zone)) AND (shared_paddr_with_customer_ind = 'N'::bpchar) AND (profane_wrd_ind = 'N'::bpchar) AND (tmo_ofnsv_name_ind = 'N'::bpchar) AND (has_individual_address = 'Y'::bpchar) AND (has_last_name = 'Y'::bpchar) AND (has_first_name = 'Y'::bpchar))"
" -> Hash (cost=10801715.18..10801715.18 rows=166514899 width=8)"
" -> Seq Scan on household_demographic c (cost=0.00..10801715.18 rows=166514899 width=8)"
" Filter: (((hspnc_lang_prfrnc_cval)::text = ANY ('{B,E,X}'::text[])) OR (hspnc_lang_prfrnc_cval IS NULL))"
" -> Index Only Scan using indx_individual_demographic_prxyid_taxind_prspctind_prsnind on individual_demographic b (cost=0.57..8019347.13 rows=286887776 width=8)"
" Index Cond: ((tax_bnkrpt_dcsd_ind = 'N'::bpchar) AND (cstmr_prspct_ind = 'Prospect'::text) AND (govt_prison_ind = 'N'::bpchar))"
Tables ddl are attached in dbfiddle -- Postgres 11 | db<>fiddle
Server configuration is:
Version: 10.11
RAM - 320GB
vCPU - 32
"maintenance_work_mem" 256MB
"work_mem" 1GB
"shared_buffers" 64GB
Any suggestions?
Thanks,
Rj