On Mon, Jan 6, 2014 at 6:24 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> "Anand Kumar, Karthik" <Karthik.AnandKumar@xxxxxxxxxxxxxx> writes:
>> We run postgres 9.1.11, on Centos 6.3, and an ext2 filesystem
>> Everything will run along okay, and every few hours, for about a couple of minutes, postgres will slow way down. A "select 1" query takes between 10 and 15 seconds to run, and the box in general gets lethargic.
>> This causes a pile up of connections at the DB, and we run out of max_connections.
>> This is accompanied with a steep spike in system CPU and load avg. No spike in user CPU or in I/O.
>
> System CPU only huh? There have been some reports of such behavior
> apparently caused by inefficiencies in the kernel's support of
> "transparent huge pages". See for instance this thread
>
> http://www.postgresql.org/message-id/flat/CABMVzL2y8mRM5C9xxejAyDqe0i1S78RAE3cEATGYNf5Ktz_Zdg@xxxxxxxxxxxxxx
>
> although it looks like in that case the real fix was to reduce the number
> of backends.

I experienced the THP defragmentation problem even with <10 connections.
What always saves me is to set

    echo always > /sys/kernel/mm/transparent_hugepage/enabled
    echo madvise > /sys/kernel/mm/transparent_hugepage/defrag

The names might be slightly different on CentOS, like
redhat_transparent_hugepage or something like this, I don't remember
exactly.

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@xxxxxxxxx
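
Since the message above is unsure which sysfs directory name applies on CentOS, here is a minimal sketch that tries both names it mentions and then writes the two suggested values. The function names and the SYSFS_MM override are hypothetical, introduced only for illustration and testing; they are not part of any tool. The sketch assumes it is run as root, does not check whether the running kernel accepts "madvise" for defrag, and the change does not survive a reboot.

```shell
#!/bin/sh
# Locate the THP sysfs directory. The second name is the variant the
# message above guesses for CentOS. SYSFS_MM is a hypothetical override
# so the sketch can be tried against a scratch directory.
find_thp_dir() {
    base="${SYSFS_MM:-/sys/kernel/mm}"
    for name in transparent_hugepage redhat_transparent_hugepage; do
        if [ -d "$base/$name" ]; then
            printf '%s\n' "$base/$name"
            return 0
        fi
    done
    return 1
}

# Apply the settings from the message: enabled=always, defrag=madvise.
apply_thp_settings() {
    dir=$(find_thp_dir) || {
        echo "THP sysfs directory not found" >&2
        return 1
    }
    echo always  > "$dir/enabled" || return 1
    echo madvise > "$dir/defrag"  || return 1
    # Read the files back; on a real kernel the active value is shown
    # in [brackets] among the accepted ones.
    printf 'enabled: %s\n' "$(cat "$dir/enabled")"
    printf 'defrag:  %s\n' "$(cat "$dir/defrag")"
}
```

To use it, source the file in a root shell and call apply_thp_settings. Reading the two files back is the useful part: if the bracketed value did not change, the kernel rejected the write and the setting is still at its previous value.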