Update on this:
We switched over to another machine with the same hardware; however,
this system was still running some older parameters we had set in its
postgresql.conf file.
So we went from max_connections = 400 to max_connections = 200, and from
work_mem = 160MB to work_mem = 200MB. So far, this other system seems to
be running OK.
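
For a rough sense of what that change means for memory, here is a small Python sketch of the worst-case arithmetic. It assumes every allowed backend runs exactly one sort or hash at the same moment; work_mem is a per-operation limit, so a complex query can use several multiples of it, and the helper name and the ops_per_backend parameter are only illustrative.

```python
# Rough worst-case arithmetic for work_mem under the two configurations.
# Assumption: each backend runs ops_per_backend sorts/hashes at once. A real
# plan can use work_mem several times, so these are estimates, not limits.

TOTAL_RAM_MB = 128 * 1024  # 128 GB, the hardware described in this thread


def work_mem_budget_mb(max_connections, work_mem_mb, ops_per_backend=1):
    """Memory (MB) used if every allowed backend hit its work_mem limit."""
    return max_connections * work_mem_mb * ops_per_backend


for label, conns, work_mem in [("before", 400, 160), ("after", 200, 200)]:
    budget = work_mem_budget_mb(conns, work_mem)
    share = 100.0 * budget / TOTAL_RAM_MB
    print(f"{label}: {conns} x {work_mem}MB = {budget}MB ({share:.1f}% of RAM)")
```

Under that assumption the ceiling drops from 64000MB to 40000MB, even though work_mem itself went up.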
Other than the obvious fact that each connection has a certain amount of
memory usage, is there something else to watch for when increasing
connections to numbers like 400? When we had the issue of the system
jumping to 100% CPU usage, the number of backends connected to the
cluster peaked at 250 but was generally around 175, so well below the
400 max_connections we allow. So could this be the culprit?
I'll be watching the cluster as we run on the new configuration (with
only 200 max_connections).
- Brian F
On 10/27/2011 03:22 PM, Brian Fehrle wrote:
On 10/27/2011 02:50 PM, Tom Lane wrote:
Brian Fehrle <brianf@xxxxxxxxxxxxxxxxxxx> writes:
Hi all, need some help/clues on tracking down a performance issue.
PostgreSQL version: 8.3.11
I've got a system that has 32 cores and 128 gigs of ram. We have
connection pooling set up, with about 100 - 200 persistent connections
open to the database. Our applications then use these connections to
query the database constantly, but when a connection isn't currently
executing a query, it's <IDLE>. On average, at any given time, there are
3 - 6 connections that are actually executing a query, while the rest
are <IDLE>.
About once a day, queries that normally take just a few seconds slow
way down, and start to pile up, to the point where instead of just having
3-6 queries running at any given time, we get 100 - 200. The whole
system comes to a crawl, and looking at top, the CPU usage is 99%.
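
To put numbers on the idle versus active split during one of these episodes, something like the following Python sketch could be used. It assumes the input is the list of current_query values from pg_stat_activity on 8.3, where idle backends show up as '<IDLE>' or '<IDLE> in transaction'; the function name is made up for illustration, and fetching the rows is left to whatever client is at hand.

```python
from collections import Counter

# Assumption: the values come from something like
#   SELECT current_query FROM pg_stat_activity;
# on 8.3, where idle backends report '<IDLE>' or '<IDLE> in transaction'.


def summarize_backends(current_queries):
    """Count backends by state from pg_stat_activity.current_query values."""
    counts = Counter()
    for query in current_queries:
        text = query.strip()
        if text == "<IDLE>":
            counts["idle"] += 1
        elif text == "<IDLE> in transaction":
            counts["idle_in_transaction"] += 1
        else:
            # Anything else is treated as a running statement.
            counts["active"] += 1
    return dict(counts)


# Made-up sample resembling the normal state described above.
sample = ["<IDLE>"] * 170 + ["SELECT 1"] * 5
print(summarize_backends(sample))
```

Sampling this every few seconds would show whether the active count climbs gradually or jumps all at once when the slowdown starts.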
This is jumping to a conclusion based on insufficient data, but what you
describe sounds a bit like the sinval queue contention problems that we
fixed in 8.4. Some prior reports of that:
http://archives.postgresql.org/pgsql-performance/2008-01/msg00001.php
http://archives.postgresql.org/pgsql-performance/2010-06/msg00452.php
If your symptoms match those, the best fix would be to update to 8.4.x
or later, but a stopgap solution would be to cut down on the number of
idle backends.
regards, tom lane
That sounds somewhat close to the same issue I am seeing. Main
differences being that my spike lasts for much longer than a few
minutes, and can only be resolved when the cluster is restarted. Also,
that second link shows top output where most of the CPU time is in
'user', whereas mine is mostly in 'sys'.
Is there anything more I can look at to get more info on this 'sinval
queue contention problem'?
Also, having my cpu usage high in 'sys' rather than 'us', could that
be a red flag? Or is that normal?
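
For tracking the 'sys' versus 'us' split over time rather than eyeballing top, a minimal Python sketch along these lines might help. It assumes a Linux host and takes two samples of the aggregate 'cpu' line from /proc/stat; the function names are illustrative, and only the first eight counters are summed so that guest time is not counted twice on newer kernels.

```python
def parse_cpu_line(line):
    """Return the jiffy counters from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # user, nice, system, idle, iowait, irq, softirq, steal (where present)
    return [int(value) for value in fields[1:9]]


def cpu_split(before, after):
    """Percent of elapsed CPU time in user (incl. nice) and in system mode."""
    first = parse_cpu_line(before)
    second = parse_cpu_line(after)
    delta = [b - a for a, b in zip(first, second)]
    total = sum(delta)
    if total <= 0:
        raise ValueError("no CPU time elapsed between samples")
    user = delta[0] + delta[1]  # user + nice
    system = delta[2]           # kernel mode, what top reports as 'sy'
    return 100.0 * user / total, 100.0 * system / total


# Made-up samples: 50 user, 450 system and 500 idle jiffies elapsed.
user_pct, sys_pct = cpu_split("cpu 100 0 100 800 0 0 0 0",
                              "cpu 150 0 550 1300 0 0 0 0")
print(f"user {user_pct:.1f}%  sys {sys_pct:.1f}%")
```

On a live system the two strings would be the first line of /proc/stat read a few seconds apart.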
- Brian F