Join order optimization

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

I'm from the Hibernate team (Java ORM) and a user recently reported that a change in our SQL rendering affected his query plans in a bad way.

In short, we decided to model certain constructs in our ORM with "nested joins" i.e. using parenthesis to express the join order. This is what we want to model semantically, though there are cases when we could detect that we don't need the explicit join ordering to get the same semantics. I expected that the PostgreSQL optimizer can do the same reasoning and do further join re-ordering to produce an optimal plan, but to my surprise it seems it isn't capable to do that.

The query we generate right now is of the following structure:

from tbl1 t1
join (tbl2 t2
    left join tbl3 t3 on t3.pkAndFk = t2.pk
    left join tbl4 t4 on t4.pkAndFk = t2.pk
    ...
    left join tbl9 t9 on t9.pkAndFk = t2.pk
) on t2.fk = t1.pk
where t1.indexedColumn = ...

whereas the query we generated before, which is semantically equivalent, is the following:

from tbl1 t1
join tbl2 t2 on t2.fk = t1.pk
left join tbl3 t3 on t3.pkAndFk = t2.pk
left join tbl4 t4 on t4.pkAndFk = t2.pk
...
left join tbl9 t9 on t9.pkAndFk = t2.pk
where t1.indexedColumn = ...


You can find the full queries in the attachments section of the issue report from the user: https://hibernate.atlassian.net/browse/HHH-16595

Query_Hibernate5.txt shows the old style query without parenthesis and Query_Hibernate6.txt shows the new style. You will also find the query plans for the two queries attached as CSV files.

It almost seems like the PostgreSQL optimizer sees the parenthesis for join ordering as an optimization fence!?

The user reported that the behavior is reproducible in PostgreSQL versions 11 and 15. He promised to provide a full reproducer for this which I am still waiting for, but I'll share it with you as soon as that was provided if needed.


I think that we can detect that the parenthesis is unnecessary in this particular case, but ideally PostgreSQL would be able to detect this as well to plan the optimal join order. Any ideas what is going on here? Is this a bug or missed optimization in the query optimizer?

I'm a bit worried about what PostgreSQL will produce for queries that really need the parenthesis for join ordering e.g.

from tbl1 t1
left join (tbl2 t2
    join tbl3 t3 on t3.pkAndFk = t2.pk
    join tbl4 t4 on t4.pkAndFk = t2.pk
    ...
    join tbl9 t9 on t9.pkAndFk = t2.pk
) on t2.fk = t1.pk
where t1.indexedColumn = ...

Thanks for any help.

Christian






[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux