Hello all. I'm migrating a database from PG 9.6 to 10.3 and have noticed a particular query that is performing very badly compared to its 9.6 counterpart.
The plan on 9.6 v 10.3 are effectively identical except in 9.6 the planner decides to use an index only scan on the primary key and in 10.3 it does a sequential scan. The problem is the sequential scan is for a table of 75M rows and 25 columns so its quiet a lot of pages it has to traverse.
This is the query:
explain verbose
WITH removed AS (
DELETE FROM match m
WHERE
NOT EXISTS (
SELECT 1
FROM build.household h -- This is the table that has 70M rows and does a full table scan in 10.3
WHERE h.household_id = m.household_id
) OR (
m.property_id IS NOT NULL AND
NOT EXISTS (
SELECT 1
FROM build.property p
WHERE p.household_id = m.household_id AND p.property_id = m.property_id
)
)
RETURNING *
)
INSERT INTO orphaned_matches (household_id, account_id, candidate_id, matched_at, full_name, first_name, last_name, match_reason, property_id, owner_id)
SELECT
removed.household_id,
removed.account_id,
removed.candidate_id,
removed.created_at,
removed.full_name,
removed.first_name,
removed.last_name,
removed.match_reason,
removed.property_id,
removed.owner_id
FROM removed;
What's worse is that in 10.3, the number of rows is actually much smaller than in 9.6 because I am doing this query on a partitioned table (table name "match") with a reduced data set.
Query plans for both are attached, plus the query.
thanks
--Cory
Insert on public.orphaned_matches (cost=204030825.83..204247350.03 rows=8660968 width=264) CTE removed -> Delete on public.match m (cost=0.00..204030825.83 rows=8660968 width=6) Output: m.id, m.created_at, m.modified_at, m.household_id, m.property_id, m.match_reason, m.full_name, m.first_name, m.middle_name, m.last_name, m.account_id, m.candidate_id, m.match_category, m.confidence, m.owner_id, m.match_resource -> Seq Scan on public.match m (cost=0.00..204030825.83 rows=8660968 width=6) Output: m.ctid Filter: ((NOT (SubPlan 1)) OR ((m.property_id IS NOT NULL) AND (NOT (SubPlan 2)))) SubPlan 1 -> Index Only Scan using uq_household_id on build.household h (cost=0.57..8.59 rows=1 width=0) Index Cond: (h.household_id = (m.household_id)::text) SubPlan 2 -> Index Scan using property_property_id_idx on build.property p (cost=0.57..8.59 rows=1 width=0) Index Cond: (p.property_id = m.property_id) Filter: (p.household_id = (m.household_id)::text) -> CTE Scan on removed (cost=0.00..216524.20 rows=8660968 width=264) Output: nextval('orphaned_matches_id_seq'::regclass), now(), removed.household_id, removed.account_id, removed.candidate_id, removed.created_at, removed.full_name, removed.first_name, removed.last_name, removed.match_reason, removed.property_id, removed.owner_id (16 rows)
Insert on match.orphaned_matches (cost=1823761513653.51..1823761525043.91 rows=455616 width=380) CTE removed -> Delete on match m (cost=0.00..1823761513653.51 rows=455616 width=6) Output: m.id, m.created_at, m.modified_at, m.household_id, m.account_id, m.candidate_id, m.match_reason, m.property_id, m.full_name, m.first_name, m.middle_name, m.last_name, m.match_category, m.confidence, m.owner_id, m.match_resource -> Seq Scan on match m (cost=0.00..1823761513653.51 rows=455616 width=6) Output: m.ctid Filter: ((NOT (SubPlan 1)) OR ((m.property_id IS NOT NULL) AND (NOT (SubPlan 2)))) SubPlan 1 -> Seq Scan on build.household h (cost=0.00..2948996.80 rows=1 width=0) Filter: (h.household_id = (m.household_id)::text) SubPlan 2 -> Index Scan using uq_idx_property_id on build.property p (cost=0.57..2.59 rows=1 width=0) Index Cond: (p.property_id = m.property_id) Filter: (p.household_id = (m.household_id)::text) -> CTE Scan on removed (cost=0.00..11390.40 rows=455616 width=380) Output: nextval('orphaned_matches_id_seq'::regclass), now(), removed.household_id, removed.account_id, removed.candidate_id, removed.created_at, removed.full_name, removed.first_name, removed.last_name, removed.match_reason, removed.property_id, removed.owner_id (16 rows)
Attachment:
query.sql
Description: Binary data