Table definition: Table "load.lm_queue" Column | Type | Collation | Nullable | Default ------------------------+--------------------------------+-----------+----------+------------------------- guid | character(33) | | not null | host_name | character varying(40) | | | priority | numeric(11,0) | | not null | request_time | timestamp(6) without time zone | | not null | collection_name | character varying(40) | | not null | stage_id | numeric(11,0) | | | source_file | character varying(250) | | | lm_id | numeric(11,0) | | | start_time | timestamp(6) without time zone | | | status_text | character varying(225) | | not null | NULL::character varying load_data_id | character varying(500) | | | docs_in_load | numeric(11,0) | | | client_name | character varying(50) | | | status_code | numeric(11,0) | | | email_address | character varying(2000) | | | hold_flag | character(1) | | | process_type | character varying(40) | | | cancel_flag | character(1) | | | file_type | character varying(6) | | | lm_data | character varying(4000) | | | ds_request_time | timestamp(6) without time zone | | | ds_id | numeric(11,0) | | | ds_start_time | timestamp(6) without time zone | | | auto_promote_flag | character(1) | | | extract_out_file | character varying(250) | | | last_upd_time | timestamp(6) without time zone | | | ds_fail_count | numeric(11,0) | | | cc_collection | character varying(40) | | | cc_environment | character varying(40) | | | cc_fail_on_db_mismatch | character(1) | | | cc_tracking_guid | character varying(33) | | | cc_numrows | character varying(50) | | | cc_owner | character varying(30) | | | cc_password | character varying(30) | | | parent_guid | character varying(33) | | | Indexes: "xpklm_queue" PRIMARY KEY, btree (guid) "idx_hn_cn_dsid_sc_dst_dfc" btree (host_name, collection_name, ds_id, status_code, ds_start_time, ds_fail_count) "ix_lm_cc" btree (collection_name, client_name) "ix_lm_chl" btree (client_name, host_name, lm_id) Query and plan: explain
SELECT GUID, COLLECTION_NAME, PRIORITY, START_TIME, REQUEST_TIME, CLIENT_NAME, PROCESS_TYPE, PARENT_GUID
FROM LOAD.LM_QUEUE lmq1
WHERE CLIENT_NAME='WLCASES'
AND HOLD_FLAG='Y'
AND HOST_NAME='WLLOADB'
AND STATUS_CODE in (1)
AND NOT EXISTS
(SELECT COLLECTION_NAME
FROM LOAD.LM_QUEUE lmq2
WHERE lmq1.COLLECTION_NAME = lmq2.COLLECTION_NAME
AND LM_ID <> 0
AND PROCESS_TYPE NOT IN('EXTRACT'))
OR (PROCESS_TYPE in ('UNLOCK','UNLOCK RERUN')
AND LM_ID = 0
AND CLIENT_NAME='WLCASES'
AND HOST_NAME= 'WLLOADB' )
ORDER BY PRIORITY DESC, REQUEST_TIME ASC; Sort (cost=1578.99..1579.00 rows=1 width=120) Sort Key: lmq1.priority DESC, lmq1.request_time -> Index Scan using ix_lm_chl on lm_queue lmq1 (cost=0.40..1578.98 rows=1 width=120) Index Cond: (((client_name)::text = 'WLCASES'::text) AND ((host_name)::text = 'WLLOADB'::text)) Filter: (((hold_flag = 'Y'::bpchar) AND (status_code = '1'::numeric) AND (NOT (alternatives: SubPlan 1 or hashed SubPlan 2))) OR (((process_type)::text
= ANY ('{UNLOCK,"UNLOCK RERUN"}'::text[])) AND (lm_id = '0'::numeric))) SubPlan 1 -> Index Scan using ix_lm_cc on lm_queue lmq2 (cost=0.40..177.93 rows=1 width=0) Index Cond: ((collection_name)::text = (lmq1.collection_name)::text) Filter: ((lm_id <> '0'::numeric) AND ((process_type)::text <> 'EXTRACT'::text)) SubPlan 2 -> Seq Scan on lm_queue lmq2_1 (cost=0.00..124999.06 rows=12 width=32) Filter: ((lm_id <> '0'::numeric) AND ((process_type)::text <> 'EXTRACT'::text)) Question on the plan above: I understand SubPlan 1 above- it is joining into the NOT EXISTS via the
lmq1.COLLECTION_NAME = lmq2.COLLECTION_NAME and then applying the other filtering inside the NOT EXISTS. But I don’t understand SubPlan 2. Given the filter conditions under SubPlan 2 it is also coming from the NOT EXISTS because that is where LM_ID <> 0 and
PROCESS_TYPE <> EXTRACT exist but I don’t understand the scenario where this query would need to use SubPlan 2. Would anyone be able to explain under what condition(s) SubPlan 2 would get executed? I’m trying to understand the precedence of AND/OR operations when everything is not tied together with ()’s. I see the following filtering in the query:
(SELECT COLLECTION_NAME
FROM LOAD.LM_QUEUE lmq2
WHERE lmq1.COLLECTION_NAME = lmq2.COLLECTION_NAME
AND LM_ID <> 0
AND PROCESS_TYPE NOT IN('EXTRACT'))
AND LM_ID = 0
AND CLIENT_NAME='WLCASES'
AND HOST_NAME= 'WLLOADB' )
All of the conditions in #6 above go together in that OR clause because of the (). According to this link: https://www.postgresql.org/docs/14/sql-syntax-lexical.html#SQL-PRECEDENCE AND has higher precedence over OR. So in my mind that would imply the order of the filters in the WHERE clause shouldn’t matter- if there are multiple AND filters and OR filters regardless of the order in the FROM clause the AND’s are
applied first. But if I change the order of the filters- moved
AND HOLD_FLAG='Y' AND HOST_NAME='WLLOADB' AND STATUS_CODE in (1) to the bottom of the query: explain
SELECT GUID, COLLECTION_NAME, PRIORITY, START_TIME, REQUEST_TIME, CLIENT_NAME, PROCESS_TYPE, PARENT_GUID
FROM LOAD.LM_QUEUE lmq1
WHERE CLIENT_NAME='WLCASES'
AND NOT EXISTS
(SELECT COLLECTION_NAME
FROM LOAD.LM_QUEUE lmq2
WHERE lmq1.COLLECTION_NAME = lmq2.COLLECTION_NAME
AND LM_ID <> 0
AND PROCESS_TYPE NOT IN('EXTRACT'))
OR (PROCESS_TYPE in ('UNLOCK','UNLOCK RERUN')
AND LM_ID = 0
AND CLIENT_NAME='WLCASES'
AND HOST_NAME= 'WLLOADB' )
AND HOLD_FLAG='Y'
AND HOST_NAME='WLLOADB'
AND STATUS_CODE in (1)
ORDER BY PRIORITY DESC, REQUEST_TIME ASC; The plan changes: Sort (cost=9382.94..9382.97 rows=12 width=169) Sort Key: lmq1.priority DESC, lmq1.request_time -> Bitmap Heap Scan on lm_queue lmq1 (cost=4572.59..9382.73 rows=12 width=169) Recheck Cond: ((client_name)::text = 'WLCASES'::text) Filter: ((NOT (alternatives: SubPlan 1 or hashed SubPlan 2)) OR (((process_type)::text = ANY ('{UNLOCK,"UNLOCK RERUN"}'::text[])) AND (lm_id =
'0'::numeric) AND ((host_name)::text = 'WLLOADB'::text) AND (hold_flag = 'Y'::bpchar) AND ((host_name)::text = 'WLLOADB'::text) AND (status_code = '1'::numeric))) -> Bitmap Index Scan on ix_lm_chl (cost=0.00..4572.58 rows=25 width=0) Index Cond: ((client_name)::text = 'WLCASES'::text) SubPlan 1 -> Bitmap Heap Scan on lm_queue lmq2 (cost=164.44..188.42 rows=1 width=0) Recheck Cond: ((lmq1.collection_name)::text = (collection_name)::text) Filter: ((lm_id <> '0'::numeric) AND ((process_type)::text <> 'EXTRACT'::text)) -> Bitmap Index Scan on ix_lm_cc (cost=0.00..164.44 rows=6 width=0) Index Cond: ((collection_name)::text = (lmq1.collection_name)::text) SubPlan 2 -> Seq Scan on lm_queue lmq2_1 (cost=0.00..124999.49 rows=25 width=32) Filter: ((lm_id <> '0'::numeric) AND ((process_type)::text <> 'EXTRACT'::text)) In the original plan above I believe the query drives off index ix_lm_chl applying both client_name = WLCASES and host_name = WLLOADB to the index cond. In the plan directly above I believe it also drives off index ix_lm_chl but it is
only applying client_name = WLCASES to the index cond. If AND’s are applied first then why wouldn’t the modified query apply both client_name = WLCASES and host_name = WLLOADB to the index cond? Is it because those are moved below the OR condition? Thanks in advance. Steve |