It's an interesting idea, however when I rewrote the original query to use "WHERE EXISTS" rather than "WHERE IN", I get the same bad execution plan. I think this really has to do with the Postgres optimizer's limitations with respect to outer joins.
In my case it's certainly possible to rewrite the query by hand to eliminate the outer join and get the same results.
And after posting the original problem, I have also found that with some work it's possible to make Hibernate generate a query that eliminates the outer join and get the same results.
But I think improving the Postgres optimizer to handle such cases would be a nice improvement. Then again, having lived through many years of Oracle optimizer bugs, it might be easier said than done.
On Tue, Apr 30, 2013 at 3:24 PM, Vitalii Tymchyshyn <tivv00@xxxxxxxxx> wrote:
What I can say is that hibernate has "exists" in both HQL and criteria API (e.g. see http://www.cereslogic.com/pages/2008/09/22/hibernate-criteria-subqueries-exists/ for criteria). So, may be it's easier for you to tune your hibernate query to use exists--2013/4/30 Mark Hampton <mark@xxxxxxxxxxxxx>I have a Hibernate-generated query (That's not going to change, so let's just focus on the Postgres side for now) like this:SELECT *from PERSON pwhere p.PERSON_ID in (select distinct p2.PERSON_IDfrom PERSON p2left outer join PERSON_ALIAS pa onp2.PERSON_ID = pa.PERSON_IDwhere (lower(p1.SURNAME) = 'duck' orlower(pa.SURNAME) = 'duck') and(lower(p1.FORENAME) = 'donald' orlower(pa.FORENAME) = 'donald'))order by p.PERSON_ID asc;There are function-based indexes on PERSON and PERSON_ALIAS as follows:CREATE INDEX PERSON_FORENAME_LOWER_FBIDX ON PERSON (LOWER(FORENAME) VARCHAR_PATTERN_OPS);CREATE INDEX PERSON_SURNAME_LOWER_FBIDX ON PERSON (LOWER(SURNAME) VARCHAR_PATTERN_OPS);CREATE INDEX PERSON_ALIAS_FORENAME_LOWER_FBIDX ON PERSON_ALIAS (LOWER(FORENAME) VARCHAR_PATTERN_OPS);
CREATE INDEX PERSON_ALIAS_SURNAME_LOWER_FBIDX ON PERSON_ALIAS (LOWER(SURNAME) VARCHAR_PATTERN_OPS);The problem is that the above query doesn't use the indexes. The "or" clauses across the outer-join seem to be the culprit. If I rewrite the query as follows, Postgres will use the index:SELECT *from PERSON pwhere (p.PERSON_ID in (select p2.PERSON_IDfrom TRAVELER.PERSON p2join TRAVELER.OTHER_NAME pa on p2.PERSON_ID =pa.PERSON_IDwhere lower(p2.SURNAME) = 'duck' andlower(pa.FORENAME) = 'donald') orp.PERSON_ID in (select p2.PERSON_IDfrom TRAVELER.PERSON p2join TRAVELER.OTHER_NAME pa on p2.PERSON_ID =pa.PERSON_IDwhere lower(pa.SURNAME) = 'duck' andlower(p2.FORENAME) = 'donald') orp.PERSON_ID in (select p2.PERSON_IDfrom TRAVELER.PERSON p2where lower(p2.SURNAME) = 'duck' andlower(p2.FORENAME) = 'donald') orp.PERSON_ID in (select p2.PERSON_IDfrom TRAVELER.OTHER_NAME pawhere lower(pa.SURNAME) = 'duck' andlower(pa.FORENAME) = 'donald'))order by p.PERSON_ID asc;So my question is this: Is there a way to get the Postgres optimizer "rewrite" the query execution plan to use the equivalent, but much more efficient latter form?And before you ask; yes, there are better ways of writing this query. But we're dealing with Java developers and Hibernate here. It's a legacy system, and the policy is to avoid hand-written SQL, so for the moment let's not go down that rabbit hole, and focus on the issue of what the optimizer can and cannot do.
Best regards,
Vitalii Tymchyshyn