In our application after upgrading from postgresql from v15 to v16 performance for some queries dropped from less than 1-2 seconds to 2+ minutes.
After checking out some blogposts regarding DISTINCT improvements in v16 i was able to "fix" it by adding order by clauses in subqueries, still it looks more like a workaround for me and i would like to ask if it is working as expected now.
I was able to reproduce the issue by the following steps
1. start postgresql v16 in docker
docker run --name some-postgres16 -e POSTGRES_PASSWORD=mysecretpassword -d postgres:16
- data is originated by northwind db, but tables were created with different schema by custom software
- each table has 8k rows
- _BDN_Terretories__202406100822.sql and _BDN_EmployeeTerritories__202406100822.sql files are attached
3. Execute the following query
select distinct tbl1."BDN_EmployeeTerritories_ID", tbl2."BDN_Terretories_ID", tbl1."Reference_Date" from
(
select "BDN_EmployeeTerritories_ID", "Reference_Date", token
from public."BDN_EmployeeTerritories",
unnest(string_to_array("EMP_TerretoryID", ';')) s(token)
--order by token
) tbl1
join (
select "BDN_Terretories_ID", token from public."BDN_Terretories", unnest(string_to_array("EMP_TerretoryID", ';')) s(token)
) tbl2 on tbl1.token = tbl2.token
Observations:
1. query runs for 4-5 seconds on v16 and less than a second on v15
2. in v16 it also goes downs to less than a second if
2.1 distinct is removed
2.2 unnest is removed. it is not really needed for this particular data but this query is autogenerated and unnest makes sense for other data
2.3 "order by token" is uncommented, this is my current way of fixing the problem I would really appreciate some feedback if that is expected behaviour and if there are better solutions
Vitalii Lytovskyi
Attachment:
_BDN_EmployeeTerritories__202406100822.sql
Description: Binary data
Attachment:
_BDN_Terretories__202406100822.sql
Description: Binary data