Re: upgrade from 9.2.x to 9.3 causes significant performance degradation

Lonni J Friedman <netllama@xxxxxxxxx> · Wed, 18 Sep 2013 10:30:16 -0700

On Wed, Sep 18, 2013 at 2:02 AM, Kevin Grittner <kgrittn@xxxxxxxxx> wrote:
> Lonni J Friedman <netllama@xxxxxxxxx> wrote:
>
>> top shows over 90% of the load is in sys space.  vmstat output
>> seems to suggest that it's CPU bound (or bouncing back & forth):
>
> Can you run `perf top` during an episode and see what kernel
> functions are using all that CPU?

I take back what I said earlier.  While the master is currently back
to normal performance, the two hot standby slaves are still churning
something awful.

If I run 'perf top' on either slave, after a few seconds, these are
consistently the top three in the list:
 84.57%  [kernel]               [k] _spin_lock_irqsave
  6.21%  [unknown]              [.] 0x0000000000659f60
  4.69%  [kernel]               [k] compaction_alloc

>
> This looks similar to cases I've seen of THP defrag going wild.
> Did the OS version or configuration change?  Did the PostgreSQL
> memory settings (like shared_buffers) change?

I think you're onto something here with respect to THP defrag going
wild.  I set /sys/kernel/mm/transparent_hugepage/defrag to 'never' and
immediately the load dropped on both slaves from over 5.00 to under
1.00.
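
For reference, here is a minimal sketch of checking and changing the
setting. It assumes the sysfs path mentioned above (some distribution
kernels may expose it under a different directory). The helper name
thp_active and the THP_DIR variable are made up for this example. The
write needs root and does not persist across reboots.

```shell
#!/bin/sh
# Directory holding the THP knobs; override THP_DIR if your kernel differs.
THP_DIR=${THP_DIR:-/sys/kernel/mm/transparent_hugepage}

# Print the active value of a THP setting file. The kernel marks the
# active choice with brackets, e.g. "always madvise [never]".
thp_active() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

# Show the current defrag mode if the file is readable on this system.
if [ -r "$THP_DIR/defrag" ]; then
    echo "THP defrag is currently: $(thp_active "$THP_DIR/defrag")"
fi

# To disable defrag until the next reboot (run as root):
#   echo never > "$THP_DIR/defrag"
```

The write is left as a comment so the script is safe to run as an
unprivileged user; it only reports the current mode.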

This raises the question: is this a kernel bug, or is there some other
solution to the problem?

It also seems odd that the problem didn't appear until I switched from
9.2 to 9.3.  Is it possible this is related to the change from using
SysV shared memory to using POSIX shared memory and mmap for memory
management?
