Re: Sudden slow down and spike in system CPU causes max_connections to get exhausted

Tom Lane <tgl@xxxxxxxxxxxxx> · Mon, 06 Jan 2014 21:24:00 -0500

"Anand Kumar, Karthik" <Karthik.AnandKumar@xxxxxxxxxxxxxx> writes:
> We run postgres 9.1.11 on CentOS 6.3, with an ext2 filesystem.
> Everything will run along okay, and every few hours, for about a couple of minutes, postgres will slow way down. A "select 1" query takes between 10 and 15 seconds to run, and the box in general gets lethargic.
> This causes a pile up of connections at the DB, and we run out of max_connections.
> This is accompanied by a steep spike in system CPU and load average. There is no spike in user CPU or in I/O.

System CPU only huh?  There have been some reports of such behavior
apparently caused by inefficiencies in the kernel's support of
"transparent huge pages".  See for instance this thread

http://www.postgresql.org/message-id/flat/CABMVzL2y8mRM5C9xxejAyDqe0i1S78RAE3cEATGYNf5Ktz_Zdg@xxxxxxxxxxxxxx

although it looks like in that case the real fix was to reduce the number
of backends.
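
A quick way to see whether transparent huge pages are active on a given box is to read the kernel's sysfs setting. The following is a minimal sketch, not a definitive check: it assumes the setting lives at one of the two sysfs paths listed (the second being the Red Hat-specific location used by RHEL/CentOS 6 kernels), and the helper names are made up for illustration.

```python
from pathlib import Path

# Candidate sysfs locations (assumption: one of these exists on the host).
# The second is the Red Hat-specific path used by RHEL/CentOS 6 kernels.
THP_PATHS = (
    "/sys/kernel/mm/transparent_hugepage/enabled",
    "/sys/kernel/mm/redhat_transparent_hugepage/enabled",
)


def parse_thp_setting(text):
    """Return the active value from text such as '[always] madvise never'.

    The kernel marks the active choice with square brackets.
    Returns None if no bracketed token is found.
    """
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def current_thp_setting(paths=THP_PATHS):
    """Read the first existing sysfs file and return its active value."""
    for candidate in paths:
        sysfs_file = Path(candidate)
        if sysfs_file.is_file():
            return parse_thp_setting(sysfs_file.read_text())
    return None  # no known THP control file on this kernel


if __name__ == "__main__":
    print("transparent huge pages:", current_thp_setting())
```

A result of "always" means the kernel may apply huge pages to any process, which is the configuration the reports above were about. A result of None only means neither path was found.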

> We do typically have a lot of idle connections (1500 connections total, over 1000 of them idle at any given time). We're in the midst of installing pgbouncer to try to mitigate the problem, but that still doesn't address the root cause.

1500 connections?  What makes you think that itself isn't the root cause?
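
To put numbers on how many of those backends are actually doing work, one can tally pg_stat_activity. Here is a rough sketch, assuming a 9.1 server, where the view has no "state" column and idle backends instead report '<IDLE>' or '<IDLE> in transaction' in current_query. The function name and sample values are hypothetical, and fetching the rows with a database driver is left out.

```python
from collections import Counter

# Query to run on a 9.1 server; each row is one backend.
QUERY_9_1 = "SELECT current_query FROM pg_stat_activity"


def summarize_backends(current_queries):
    """Count backends by rough state, given their current_query values.

    Anything that is not one of the two idle markers is counted as
    active. That includes the session running the query itself, and
    any rows where the querying role is not allowed to see the query
    text and gets a placeholder instead.
    """
    counts = Counter()
    for query in current_queries:
        if query == "<IDLE>":
            counts["idle"] += 1
        elif query == "<IDLE> in transaction":
            counts["idle in transaction"] += 1
        else:
            counts["active"] += 1
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the system discussed here.
    sample = ["<IDLE>"] * 3 + ["<IDLE> in transaction"] + ["select 1"] * 2
    print(summarize_backends(sample))
```

If the active count stays small compared with the total, a pooler with a much lower max_connections is likely to cover the real workload.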

			regards, tom lane
