On 05/06/14 15:29, Igor Neyman wrote:
>> -----Original Message-----
>> From: pgsql-performance-owner@xxxxxxxxxxxxxx [mailto:pgsql-performance-owner@xxxxxxxxxxxxxx] On Behalf Of Linos
>> Sent: Wednesday, June 04, 2014 6:10 PM
>> To: Merlin Moncure
>> Cc: pgsql-performance@xxxxxxxxxxxxxx
>> Subject: Re: [PERFORM] Possible performance regression in PostgreSQL 9.2/9.3?
>>
>> On 04/06/14 22:57, Merlin Moncure wrote:
>>> On Wed, Jun 4, 2014 at 2:58 PM, Linos <info@xxxxxxxx> wrote:
>>>> On 04/06/14 21:36, Merlin Moncure wrote:
>>>>> On Wed, Jun 4, 2014 at 8:56 AM, Linos <info@xxxxxxxx> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> Some days ago I upgraded from 8.4 to 9.3. After the upgrade some queries started performing a lot slower; the query I am using in this example is pasted here:
>>>>>> http://pastebin.com/71DjEC21
>>>>>>
>>>>>> Since it is a production database, users are complaining because queries are much slower than before, so I tried downgrading to 9.2, with the same result as on 9.3. I finally restored the database on 8.4 and the query is as fast as before.
>>>>>> All these tests were done on Debian Squeeze with the 2.6.32-5-amd64 kernel; the hardware is an Intel Xeon E5520 with 32 GB ECC RAM, and the storage is software RAID 10 on 4 Seagate ST3146356SS SAS drives.
>>>>>>
>>>>>> postgresql.conf:
>>>>>> max_connections = 250
>>>>>> shared_buffers = 6144MB
>>>>>> temp_buffers = 8MB
>>>>>> max_prepared_transactions = 0
>>>>>> work_mem = 24MB
>>>>>> maintenance_work_mem = 384MB
>>>>>> max_stack_depth = 7MB
>>>>>> default_statistics_target = 150
>>>>>> effective_cache_size = 24576MB
>>>>>>
>>>>>> 9.3 explain:
>>>>>> http://explain.depesz.com/s/jP7o
>>>>>>
>>>>>> 9.3 explain analyze:
>>>>>> http://explain.depesz.com/s/6UQT
>>>>>>
>>>>>> 9.2 explain:
>>>>>> http://explain.depesz.com/s/EW1g
>>>>>>
>>>>>> 8.4 explain:
>>>>>> http://explain.depesz.com/s/iAba
>>>>>>
>>>>>> 8.4 explain analyze:
>>>>>> http://explain.depesz.com/s/MPt
>>>>>>
>>>>>> It seems to me that the total estimated cost went too high in 9.2 and 9.3, but I am not sure why. I tried commenting out parts of the query and disabling index-only scans, but I still get very bad timings and estimates.
>>>>>> The dump file is the same for all versions, and after each restore finished I ran VACUUM ANALYZE on the restored database.
>>>>> The rowcount estimates are garbage on all versions, so a good execution plan can be chalked up to chance. That being said, it seems like we're getting an awful lot of regressions of this type with recent versions.
>>>>>
>>>>> Can you try re-running this query with enable_nestloop and/or enable_material disabled? (You can disable them for a particular session via: set enable_material = false;) This is a "ghetto fix" but worth trying. If it were me, I'd be simplifying and optimizing the query.
>>>>>
>>>>> merlin
>>>> Much better with these options set to false, thank you Merlin, even better than 8.4.
>>>>
>>>> 9.3 explain analyze with enable_nestloop and enable_material set to false:
>>>> http://explain.depesz.com/s/94D
>>>>
>>>> The thing is, I have plenty of queries that are now a lot slower than before; this is only one example. I would like to find a fix or workaround.
>>>> I could downgrade to 9.1. I didn't try 9.1, but it's the first version that supports exceptions inside PL/Python and I would like to use them. Do you think this situation would be better on 9.1?
>>>> Or could I disable material and nestloop in postgresql.conf? I thought it was bad to trick the planner, but given this strange behavior I am not sure anymore.
>>> I would advise against adjusting postgresql.conf. Nested loops often give worse plans than other choices, but they can also give the best plan, sometimes by an order of magnitude or more. Planner directives should be considered a "last resort" fix and should generally not be changed in postgresql.conf. If I were in your shoes, I'd be breaking the query down and figuring out where it goes off the rails. Best-case scenario, you end up with a simplified, reproducible reduction of the problem that can help direct changes to the planner. In lieu of that, I'd look at this as a special-case optimization of problem queries.
>>>
>>> There is something else to try. Can you (temporarily) raise join_collapse_limit higher (to, say, 20) and see if you get a better plan (with and without the other planner adjustments)?
>>>
>>> merlin
>> This is the plan with join_collapse_limit=20, enable_nestloop=false, enable_material=false:
>> http://explain.depesz.com/s/PpL
>>
>> The plan with join_collapse_limit=20 but enable_nestloop and enable_material true takes too much time; it seems to have the same problem as with join_collapse_limit=8.
>>
>> I will try to create a simpler reproducible example, thank you.
>>
>> Regards,
>> Miguel Angel.
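[Editor's note: the experiments discussed above can be reproduced without touching postgresql.conf. A minimal sketch follows; the actual query is elided since it lives in the pastebin link above, and SET LOCAL is used so the overrides revert automatically at the end of the transaction:]

```sql
BEGIN;
-- Planner toggles for this transaction only; other sessions are unaffected.
SET LOCAL enable_nestloop = off;
SET LOCAL enable_material = off;
-- Let the planner reorder joins across up to 20 items instead of the default 8.
SET LOCAL join_collapse_limit = 20;

-- Compare the resulting plan and timings against the depesz links above.
EXPLAIN ANALYZE SELECT ...;  -- the query from http://pastebin.com/71DjEC21
COMMIT;
```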
>>
>> --
>> Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance
> Usually, when I increase join_collapse_limit, I also increase from_collapse_limit (to the same value).
>
> Regards,
> Igor Neyman

I tried that already and it didn't work, thank you Igor.

I have created a more complete example of this problem on the pgsql-hackers list at:
http://www.postgresql.org/message-id/5390554B.5080503@xxxxxxxx

Let's continue the conversation there.

Regards,
Miguel Angel.
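[Editor's note: if only a handful of queries regress, the overrides can be attached to the function or role that runs them instead of being set globally, which stays within Merlin's advice to leave postgresql.conf alone. A sketch with hypothetical object names:]

```sql
-- The settings apply only while the function executes and are restored
-- automatically when it returns ("slow_report" is a placeholder name).
ALTER FUNCTION slow_report()
    SET enable_nestloop = off
    SET enable_material = off;

-- Or pin them to a dedicated role ("reporting" is a placeholder) so only
-- that role's sessions pick them up at login:
ALTER ROLE reporting SET join_collapse_limit = 20;
ALTER ROLE reporting SET from_collapse_limit = 20;
```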