On Sun, Dec 2, 2012 at 9:08 AM, rahul143 <rk204885@xxxxxxxxx> wrote:
> Hello everyone,
>
> I'm seeking help in diagnosing / figuring out the issue that we have with
> our DB server:
>
> Under some (relatively non-heavy) load: 300...400 TPS, every 10-30 seconds
> server drops into high cpu system usage (90%+ SYSTEM across all CPUs - it's
> pure SYS cpu, i.e. it's not io wait, not irq, not user). Postgresql is
> taking 10-15% at the same time. Those periods would last from few seconds,
> to minutes or until Postgresql is restarted. Needless to say that system is
> barely responsive, with load average hitting over 100. We have mostly select
> statements (joins across few tables), using indexes and resulting in a small
> number of records returned. Should number of requests per second coming drop
> a bit, server does not fall into those HIGH-SYS-CPU periods. It all seems
> like postgres runs out of some resources or fighting for some locks and that
> causing kernel to go into la-la land trying to manage it.
>
>
> So far we've checked:
> - disk and nic delays / errors / utilization
> - WAL files (created rarely)
> - tables are vacuumed OK. periods of high SYS not tied to vacuum process.
> - kernel resources utilization (sufficient FS handles, shared MEM/SEM, VM)
> - increased log level, but nothing suspicious/different (to me) is reported
> there during periods of high sys-cpu
> - ran pgbench (could not reproduce the issue, even though it was producing
> over 40,000 TPS for prolonged period of time)
>
> Basically, our symptoms are exactly as was reported here over a year ago
> (though for postgres 8.3, we ran 9.1):
> http://archives.postgresql.org/pgsql-general/2011-10/msg00998.php
>
> I will be grateful for any ideas helping to resolve or diagnose this
> problem.

Didn't we just discuss this exact problem on the identically named thread?
http://postgresql.1045698.n5.nabble.com/High-SYS-CPU-need-advise-td5732045.html

If you're the same poster, it's good to reference the earlier thread and
any conclusions reached there, to save everyone's time. As it happens, I
have been working on an angle that may help solve this problem. Are you
willing and able to run a patched postgres, and what's your tolerance
for risk?

merlin

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general