Hey guys, I am in the middle of testing out a database migration from 9.6 to 10.3. We have a quasi-multi tenant based application and so are utilizing native partitions on some relations to help improve some performance.
I was issuing a query on both databases to cleanup some duplicates in preparation of applying new indexes. On the 9.6 database with all the data in one table, the query runs fine in about 6 min. On 10.3, with a work_mem setting of 1GB the query runs for about 7 minutes and then gets terminated with an out of memory error. If I bump the work_mem up fairly high (12GB out of 52GB available) the server actually crashes.
On the both databases, the total dataset size is exactly the same, ~29M records. The table looks like this:
Table "candidate_person"
Column | Type | Collation | Nullable | Default
--------------+--------------------------+-----------+----------+---------
created_at | timestamp with time zone | | not null | now()
modified_at | timestamp with time zone | | not null | now()
account_id | bigint | | not null |
candidate_id | character varying(40) | C | not null |
first_name | text | | |
middle_name | text | | |
last_name | text | | |
spouse | boolean | | not null | false
suffix | text | | |
salutation | text | | |
Partition key: LIST (account_id)
With the only difference being on 9.6 there obviously isn't any partitions. On 10.3 there are ~250 partition tables.
I have attached the server log that shows the first out of memory and then the server crash and recovery.
Attachment:
db_crash.log
Description: Binary data