Good afternoon,
--
We are currently running postgres version 12.2. I have a query that uses regexp code to confirm if a text field is a date or not. It is resource intensive and long running. I have made a regexp index (actually several flavors of regexp indexes) to aid in this queries performance. However I cannot seem to 'convince' the planner to use the indexes I create.
Following is table/index construction for your reference:
\d customertab
Table "public.customertab"
Column | Type | Collation | Nullable | Default
---------------------------+-----------------------------+-----------+----------+-------------------------------------------------
id | bigint | | not null | nextval('customer_attributes_id_seq'::regclass)
customer_attribute_key_id | integer | | |
attributable_id | integer | | |
attributable_type | character varying(255) | | |
value | character varying(255) | | |
created_at | timestamp without time zone | | |
updated_at | timestamp without time zone | | |
group_num | character varying(255) | | |
customer_transaction_id | integer | | |
account_id | integer | | |
Indexes:
"primets_custattr_pkey" PRIMARY KEY, btree (id)
"custattr_account_id" btree (account_id)
"custattr_value" btree (value) WHERE value::text ~ '^([0-9][0-9][0-9][0-9])-(([0]?[1-9])|([1][0-2]))-(([0-2]?[0-9])|([3][0-1]))$'::text AND value::text ~ '([0-9][0-9][0-9][0-9])-(0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|([0-9][0-9][0-9][0-9])-(0[469]|11])-(0[1-9]|[12][0-9]|30)|([0-9][0-9][0248][048]|[0-9][0-9][13579][26])-(02)-(0[1-9]|1[0-9]|2[0-9])|([0-9][0-9][0248][1235679]|[0-9][0-9][13579][01345789])-(02)-(0[1-9]|1[0-9]|2[0-8])'::text AND account_id IS NOT NULL
"custattr_value_acctid_keyid" btree (value, account_id, customer_attribute_key_id) WHERE value::text ~ '^([0-9][0-9][0-9][0-9])-(([0]?[1-9])|([1][0-2]))-(([0-2]?[0-9])|([3][0-1]))$'::text AND value::text ~ '([0-9][0-9][0-9][0-9])-(0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|([0-9][0-9][0-9][0-9])-(0[469]|11])-(0[1-9]|[12][0-9]|30)|([0-9][0-9][0248][048]|[0-9][0-9][13579][26])-(02)-(0[1-9]|1[0-9]|2[0-9])|([0-9][0-9][0248][1235679]|[0-9][0-9][13579][01345789])-(02)-(0[1-9]|1[0-9]|2[0-8])'::text AND account_id IS NOT NULL
Table "public.customertab"
Column | Type | Collation | Nullable | Default
---------------------------+-----------------------------+-----------+----------+-------------------------------------------------
id | bigint | | not null | nextval('customer_attributes_id_seq'::regclass)
customer_attribute_key_id | integer | | |
attributable_id | integer | | |
attributable_type | character varying(255) | | |
value | character varying(255) | | |
created_at | timestamp without time zone | | |
updated_at | timestamp without time zone | | |
group_num | character varying(255) | | |
customer_transaction_id | integer | | |
account_id | integer | | |
Indexes:
"primets_custattr_pkey" PRIMARY KEY, btree (id)
"custattr_account_id" btree (account_id)
"custattr_value" btree (value) WHERE value::text ~ '^([0-9][0-9][0-9][0-9])-(([0]?[1-9])|([1][0-2]))-(([0-2]?[0-9])|([3][0-1]))$'::text AND value::text ~ '([0-9][0-9][0-9][0-9])-(0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|([0-9][0-9][0-9][0-9])-(0[469]|11])-(0[1-9]|[12][0-9]|30)|([0-9][0-9][0248][048]|[0-9][0-9][13579][26])-(02)-(0[1-9]|1[0-9]|2[0-9])|([0-9][0-9][0248][1235679]|[0-9][0-9][13579][01345789])-(02)-(0[1-9]|1[0-9]|2[0-8])'::text AND account_id IS NOT NULL
"custattr_value_acctid_keyid" btree (value, account_id, customer_attribute_key_id) WHERE value::text ~ '^([0-9][0-9][0-9][0-9])-(([0]?[1-9])|([1][0-2]))-(([0-2]?[0-9])|([3][0-1]))$'::text AND value::text ~ '([0-9][0-9][0-9][0-9])-(0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|([0-9][0-9][0-9][0-9])-(0[469]|11])-(0[1-9]|[12][0-9]|30)|([0-9][0-9][0248][048]|[0-9][0-9][13579][26])-(02)-(0[1-9]|1[0-9]|2[0-9])|([0-9][0-9][0248][1235679]|[0-9][0-9][13579][01345789])-(02)-(0[1-9]|1[0-9]|2[0-8])'::text AND account_id IS NOT NULL
"custattr_value_includes" btree (value) INCLUDE (id, account_id, group_num) WHERE value::text ~ '^([0-9][0-9][0-9][0-9])-(([0]?[1-9])|([1][0-2]))-(([0-2]?[0-9])|([3][0-1]))$'::text AND value::text ~ '([0-9][0-9][0-9][0-9])-(0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|([0-9][0-9][0-9][0-9])-(0[469]|11])-(0[1-9]|[12][0-9]|30)|([0-9][0-9][0248][048]|[0-9][0-9][13579][26])-(02)-(0[1-9]|1[0-9]|2[0-9])|([0-9][0-9][0248][1235679]|[0-9][0-9][13579][01345789])-(02)-(0[1-9]|1[0-9]|2[0-8])'::text AND account_id IS NOT NULL
The query I am attempting to run:
explain select * FROM customertab ca
where ca.value::text ~ '^([0-9][0-9][0-9][0-9])-(([0]?[1-9])|([1][0-2]))-(([0-2]?[0-9])|([3][0-1]))$'::text
AND ca.value::text ~ '([0-9][0-9][0-9][0-9])-(0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|([0-9][0-9][0-9][0-9])-(0[469]|11])-(0[1-9]|[12][0-9]|30)|([0-9][0-9][0248][048]|[0-9][0-9][13579][26])-(02)-(0[1-9]|1[0-9]|2[0-9])|([0-9][0-9][0248][1235679]|[0-9][0-9][13579][01345789])-(02)-(0[1-9]|1[0-9]|2[0-8])'::text
AND ca.account_id IS NOT NULL
where ca.value::text ~ '^([0-9][0-9][0-9][0-9])-(([0]?[1-9])|([1][0-2]))-(([0-2]?[0-9])|([3][0-1]))$'::text
AND ca.value::text ~ '([0-9][0-9][0-9][0-9])-(0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|([0-9][0-9][0-9][0-9])-(0[469]|11])-(0[1-9]|[12][0-9]|30)|([0-9][0-9][0248][048]|[0-9][0-9][13579][26])-(02)-(0[1-9]|1[0-9]|2[0-9])|([0-9][0-9][0248][1235679]|[0-9][0-9][13579][01345789])-(02)-(0[1-9]|1[0-9]|2[0-8])'::text
AND ca.account_id IS NOT NULL
and the explain:
Bitmap Heap Scan on customertab ca (cost=9907.40..903793.74 rows=479041 width=95)
Recheck Cond: (((value)::text ~ '^([0-9][0-9][0-9][0-9])-(([0]?[1-9])|([1][0-2]))-(([0-2]?[0-9])|([3][0-1]))$'::text) AND ((value)::text ~ '([0-9][0-9][0-9][0-9])-(0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|([0-9][0-9][0
-9][0-9])-(0[469]|11])-(0[1-9]|[12][0-9]|30)|([0-9][0-9][0248][048]|[0-9][0-9][13579][26])-(02)-(0[1-9]|1[0-9]|2[0-9])|([0-9][0-9][0248][1235679]|[0-9][0-9][13579][01345789])-(02)-(0[1-9]|1[0-9]|2[0-8])'::text) AND (accou
nt_id IS NOT NULL))
-> Bitmap Index Scan on custattr_value (cost=0.00..9787.64 rows=479041 width=0)
(3 rows)
Recheck Cond: (((value)::text ~ '^([0-9][0-9][0-9][0-9])-(([0]?[1-9])|([1][0-2]))-(([0-2]?[0-9])|([3][0-1]))$'::text) AND ((value)::text ~ '([0-9][0-9][0-9][0-9])-(0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|([0-9][0-9][0
-9][0-9])-(0[469]|11])-(0[1-9]|[12][0-9]|30)|([0-9][0-9][0248][048]|[0-9][0-9][13579][26])-(02)-(0[1-9]|1[0-9]|2[0-9])|([0-9][0-9][0248][1235679]|[0-9][0-9][13579][01345789])-(02)-(0[1-9]|1[0-9]|2[0-8])'::text) AND (accou
nt_id IS NOT NULL))
-> Bitmap Index Scan on custattr_value (cost=0.00..9787.64 rows=479041 width=0)
(3 rows)
Finally if I remove the 'ca.account_id is not null' _expression_
explain select * FROM customertab ca
where ca.value::text ~ '^([0-9][0-9][0-9][0-9])-(([0]?[1-9])|([1][0-2]))-(([0-2]?[0-9])|([3][0-1]))$'::text
AND ca.value::text ~ '([0-9][0-9][0-9][0-9])-(0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|([0-9][0-9][0-9][0-9])-(0[469]|11])-(0[1-9]|[12][0-9]|30)|([0-9][0-9][0248][048]|[0-9][0-9][13579][26])-(02)-(0[1-9]|1[0-9]|2[0-9])|([0-9][0-9][0248][1235679]|[0-9][0-9][13579][01345789])-(02)-(0[1-9]|1[0-9]|2[0-8])'::text
where ca.value::text ~ '^([0-9][0-9][0-9][0-9])-(([0]?[1-9])|([1][0-2]))-(([0-2]?[0-9])|([3][0-1]))$'::text
AND ca.value::text ~ '([0-9][0-9][0-9][0-9])-(0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|([0-9][0-9][0-9][0-9])-(0[469]|11])-(0[1-9]|[12][0-9]|30)|([0-9][0-9][0248][048]|[0-9][0-9][13579][26])-(02)-(0[1-9]|1[0-9]|2[0-9])|([0-9][0-9][0248][1235679]|[0-9][0-9][13579][01345789])-(02)-(0[1-9]|1[0-9]|2[0-8])'::text
This is the explain:
Gather (cost=1000.00..1641792.30 rows=479041 width=95)
Workers Planned: 2
-> Parallel Seq Scan on customertab ca (cost=0.00..1592888.20 rows=199600 width=95)
Filter: (((value)::text ~ '^([0-9][0-9][0-9][0-9])-(([0]?[1-9])|([1][0-2]))-(([0-2]?[0-9])|([3][0-1]))$'::text) AND ((value)::text ~ '([0-9][0-9][0-9][0-9])-(0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|([0-9][0-9][0
-9][0-9])-(0[469]|11])-(0[1-9]|[12][0-9]|30)|([0-9][0-9][0248][048]|[0-9][0-9][13579][26])-(02)-(0[1-9]|1[0-9]|2[0-9])|([0-9][0-9][0248][1235679]|[0-9][0-9][13579][01345789])-(02)-(0[1-9]|1[0-9]|2[0-8])'::text))
(4 rows)
Workers Planned: 2
-> Parallel Seq Scan on customertab ca (cost=0.00..1592888.20 rows=199600 width=95)
Filter: (((value)::text ~ '^([0-9][0-9][0-9][0-9])-(([0]?[1-9])|([1][0-2]))-(([0-2]?[0-9])|([3][0-1]))$'::text) AND ((value)::text ~ '([0-9][0-9][0-9][0-9])-(0[13578]|1[02])-(0[1-9]|[12][0-9]|3[01])|([0-9][0-9][0
-9][0-9])-(0[469]|11])-(0[1-9]|[12][0-9]|30)|([0-9][0-9][0248][048]|[0-9][0-9][13579][26])-(02)-(0[1-9]|1[0-9]|2[0-9])|([0-9][0-9][0248][1235679]|[0-9][0-9][13579][01345789])-(02)-(0[1-9]|1[0-9]|2[0-8])'::text))
(4 rows)
Is there something else I can try, or am I missing something? Comments welcome.
Mark Steben
Database Administrator
@utoRevenue | Autobase
CRM division of Dominion Dealer Solutions
95D Ashley Ave.
West Springfield, MA 01089
t: 413.327-3045
f: 413.383-9567
Database Administrator
@utoRevenue | Autobase
CRM division of Dominion Dealer Solutions
95D Ashley Ave.
West Springfield, MA 01089
t: 413.327-3045
f: 413.383-9567
www.fb.com/DominionDealerSolutions
www.twitter.com/DominionDealer
www.drivedominion.com