Re: Sudden connection and load average spikes with postgresql 9.3

Josh Berkus <josh@xxxxxxxxxxxx> · Tue, 30 Jun 2015 15:56:48 -0700

On 06/30/2015 07:52 AM, eudald_v wrote:
> Two days from now, I've been experiencing that, randomly, the connections
> rise up till they reach max connections, and the load average of the server
> goes around 300~400, making every command issued on the server take
> forever. When this happens, ram is relatively low (70Gb used), cores
> activity is lower than usual and sometimes swap happens (I've swappiness
> configured to 10%)

As Tom said, the most likely reason for this is application behavior and
blocking locks.  Try some of these queries on our scripts page:

https://github.com/pgexperts/pgx_scripts/tree/master/locks
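
For a quick first look before running those scripts, something like the
following sketch could show whether sessions are piling up behind locks
or whether the application is simply opening too many connections. It
assumes a DB-API cursor (for example from psycopg2) and the 9.3 layout
of pg_stat_activity, which still has the boolean "waiting" column. The
function name and the summary labels are made up for illustration and
are not part of the pgx_scripts repository.

```python
from collections import Counter

# Assumes PostgreSQL 9.3: pg_stat_activity has "state" and a boolean
# "waiting" column that is true when a backend is blocked on a lock.
ACTIVITY_SQL = """
SELECT state, waiting, count(*)
  FROM pg_stat_activity
 GROUP BY state, waiting
"""


def summarize_activity(cursor):
    """Count sessions per state, with lock waiters in their own bucket.

    `cursor` is any DB-API cursor connected to the server (hypothetical
    helper, not an existing API).
    """
    cursor.execute(ACTIVITY_SQL)
    summary = Counter()
    for state, waiting, n in cursor.fetchall():
        # state can be NULL, e.g. when the role may not see other sessions
        key = "waiting on lock" if waiting else (state or "unknown")
        summary[key] += n
    return summary
```

Run during a spike, a large "waiting on lock" count points at blocking
locks, while a large "idle" or "idle in transaction" count points more
toward runaway connection generation by the application.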

However, I have seen some other things which cause these kinds of stalls:

* Runaway connection generation by the application, due to either a
programming bug or an irresponsible web crawler (see
https://www.pgexperts.com/blog/quinn_weaver/)

* Issues evicting blocks from shared_buffers: what is your
shared_buffers set to?  How large is your database?

* Checkpoint stalls: what FS are you on?  What are your transaction log
settings for PostgreSQL?

* Issues with the software/hardware stack around your storage, causing
total IO stalls periodically.  What does IO throughput look like
before/during/after the stalls?
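
One way to capture IO throughput before, during, and after a stall is to
sample /proc/diskstats at a fixed interval and log the deltas. The
following is a minimal sketch, assuming a Linux server; the function
names and the 5-second default interval are illustrative choices, and
tools such as iostat or sar report the same counters.

```python
import time

SECTOR_BYTES = 512  # /proc/diskstats always counts 512-byte sectors


def parse_diskstats(text):
    """Return {device: (sectors_read, sectors_written, ms_doing_io)}."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or malformed lines
        # fields: major minor name, then the per-device counters
        stats[fields[2]] = (int(fields[5]), int(fields[9]), int(fields[12]))
    return stats


def io_rates(before, after, seconds):
    """Per-device (read MB/s, write MB/s, % busy) between two samples."""
    rates = {}
    for dev, (r1, w1, busy1) in after.items():
        if dev not in before:
            continue
        r0, w0, busy0 = before[dev]
        rates[dev] = (
            (r1 - r0) * SECTOR_BYTES / seconds / 1e6,
            (w1 - w0) * SECTOR_BYTES / seconds / 1e6,
            (busy1 - busy0) / (seconds * 1000.0) * 100.0,
        )
    return rates


def sample(interval=5.0, path="/proc/diskstats"):
    """Read the counters twice, `interval` seconds apart, and diff them."""
    with open(path) as f:
        before = parse_diskstats(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_diskstats(f.read())
    return io_rates(before, after, interval)
```

Calling sample() in a loop and logging each result with a timestamp
gives a record to line up against the load spikes. Throughput dropping
to near zero while the busy percentage stays pinned near 100 would be
consistent with a total IO stall in the storage stack.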

That last one was the cause the most recent time I dealt with a situation
like yours; the culprit turned out to be bad RAID card firmware, where the
card would lock up whenever the write-through buffer came under too much
pressure.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance