Search Postgresql Archives

Re: 9.5 regression with unwanted nested loop left join

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Greg Sabino Mullane <greg@xxxxxxxxxxxx> writes:
> We are trying to upgrade a client app to Postgres 9.5, but are running 
> into some performance regression issues (even though the curent db is 8.x!). 
> One in particular that is puzzling me involves a query that keeps slipping 
> into a nested loop left join, rather than a much preferred hash join. 

The core of the problem seems to be here:

>          ->  Hash Join  (C=20929.52..44799.53 R=1 W=4) (AT=626.631..1009.945 R=124 L=1)
>                Hash Cond: ((gs.id = gregtest_status.id) AND (gs.update_time = (min(gregtest_status.update_time))))

which is implementing the gs/gs2 join here:

>            SELECT gs.id, gs.status
>            FROM gregtest_status gs
>            JOIN (
>                  SELECT id, min(update_time) AS update_time
>                  FROM gregtest_status
>                  WHERE update_time >= '2015-01-01'::date
>                  GROUP BY id
>            ) gs2 ON gs.id = gs2.id AND gs.update_time = gs2.update_time

I tried this back to 8.4, and all versions predict just one row returned
from the gs/gs2 join, whereas in reality you get 124 rows, ie, all rows
produced by gs2 have matches in gs.

The plan 9.0 and up produce is not just a nestloop above this, but a
nestloop with inner materialization step.  It would clearly not be
sensible to run this hashjoin many times, but materializing its output
gets rid of the cost-of-data problem --- and then, if you believe there's
only gonna be one row out, a simple nestloop looks cheaper than building
a hash table, at least up to a fairly large number of rows on the other
side.

8.4 avoids this trap only because it doesn't consider injecting a
materialize there.

So a brute-force fix to restore the pre-9.0 behavior would be
"set enable_material = off".  But really the problem is that it's
unobvious that all rows in the gs2 output would have matches in gs.
I wonder if there's a way to reformulate the query to make that better.
Do you even need the update_time check?

Assuming that the application's already been optimized for pre-9.0
Postgres, turning off enable_material might not be a disastrous long
term solution for it, though certainly it'd be better if you can move
away from that eventually.

			regards, tom lane


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux