Re: High CPU load on Postgres Server during Peak times!!!!

Andy Colson <andy@xxxxxxxxxxxxxxx> · Thu, 24 Sep 2009 10:32:21 -0500

Shiva Raman wrote:
Hi

Today the observed load was very high. I am pasting the top output.

TOP
top - 12:45:23 up 79 days, 14:42,  1 user,  load average: 45.84, 33.13, 25.84
Tasks: 394 total,  48 running, 346 sleeping,   0 stopped,   0 zombie
Cpu(s): 49.2%us,  0.8%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.1%si, 50.0%st
Mem:  16133676k total, 14870736k used,  1262940k free,   475484k buffers
Swap: 14466492k total,      124k used, 14466368k free, 11423616k cached

OUTPUT OF IOSTAT 1 5 (is SAN becoming a bottleneck, shows 50% CPU usage?)

clusternode2:~ # iostat 1 5
Linux 2.6.16.46-0.12-ppc64 (clusternode2)       09/24/2009      _ppc64_ (4 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          16.00    0.00    0.68    0.61   10.72   71.99

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          29.46    0.00    0.25    0.00    7.43   62.87

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          46.17    0.00    0.99    0.00   38.52   14.32

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          48.88    0.00    0.99    0.00   49.88    0.25

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          47.86    0.00    2.14    0.00   50.00    0.00

Both top and iostat show no wait time for I/O.  However, they both show 
time lost to the VM layer (steal).  You have 50% user and 50% steal, and 
0% iowait.

You said: "SAN becoming a bottleneck, shows 50% CPU usage?"

I'm not sure what you are looking at.  A SAN is like a hard disk, right?  
I assume waiting on the SAN would show up as %iowait... yes?
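
To watch those two numbers side by side, here is a minimal sketch. It assumes the six-column avg-cpu layout that iostat printed above (%user %nice %system %iowait %steal %idle); the function name cpu_wait_summary is made up for illustration. If the SAN were the bottleneck I would expect iowait to climb, not steal.

```shell
#!/bin/sh
# Print %iowait, %steal and %idle for each avg-cpu sample in iostat output.
# Assumes the six-column avg-cpu layout shown in this thread:
#   %user %nice %system %iowait %steal %idle
cpu_wait_summary() {
    awk '
        /^avg-cpu:/ { want = 1; next }   # header line, values follow on the next line
        want && NF == 6 {
            printf "iowait=%s steal=%s idle=%s\n", $4, $5, $6
            want = 0
        }
    '
}

# Usage (live): iostat -c 1 5 | cpu_wait_summary
```

Fed the last sample above, this should report iowait at 0.00 and steal at 50.00, which is the pattern I'm pointing at.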

Andy Colson wrote:
Eww.  I think that's bad.  A connection that has a transaction open 
will cause lots of row versions, which use up RAM, and make it slower to 
step through the table (even with an index).  You really need to fix up 
your code and make sure you commit transactions.  (Any statement 
(select, insert, update) will start a new transaction that you need to 
explicitly commit.)

With reference to this suggestion by Andy Colson, we checked the 
application code and found that only INSERT and UPDATE have a COMMIT, 
and SELECT has no commit. We are using a lot of "Ajax Suggest" in all 
the forms, which fetch data using SELECT statements that are not 
explicitly committed. We have started updating the code on this.

Thanks for this  suggestion.

Thanks again to Scott Marlowe for the suggestion to reduce the number of 
connections. This has now been reduced to 500.

As I mentioned in the mail, I am restarting the database every 30 
minutes. I found a shell script in the wiki which could kill the idle in 
transaction pids. The code will kill all old pids in the server.

This is the script

# If more than 20 processes match 'idle in transaction', kill the oldest one.
/usr/bin/test `/usr/bin/pgrep -f 'idle in transaction' | \
  /usr/bin/wc -l ` -gt 20 && /usr/bin/pkill -o -f 'idle in transaction'

and this is the link where the script was provided.

http://wiki.dspace.org/index.php/Idle_In_Transaction_Problem

I tried to run it as a test on the server, but the script is not 
executing. I also see that many of the "idle in transaction" PIDs are 
showing R (running) status, but most of them are showing S (sleep) 
status. Please suggest any way I can resolve this idle transaction issue.

Fixing up the code to commit selects will make the "idle in transaction" 
sessions go away.  I'm with Praveen: fix the code, avoid the scripts.
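
To find which clients are leaving transactions open while you fix the code, you can ask the server instead of grepping process titles. This is only a sketch: the function name idle_in_xact_sql and the database name "mydb" are made up, and the pg_stat_activity column names depend on the server version (procpid and current_query before 9.2, pid and state from 9.2 on), so pick the form that matches yours.

```shell
#!/bin/sh
# Emit a query listing sessions that sit idle inside an open transaction,
# oldest transaction first. Column names differ by server version:
#   9.2 and later: pid, state
#   8.3 to 9.1:    procpid, current_query
idle_in_xact_sql() {
    case "$1" in
        old)
            echo "SELECT procpid, usename, client_addr, xact_start FROM pg_stat_activity WHERE current_query = '<IDLE> in transaction' ORDER BY xact_start;"
            ;;
        *)
            echo "SELECT pid, usename, client_addr, xact_start FROM pg_stat_activity WHERE state = 'idle in transaction' ORDER BY xact_start;"
            ;;
    esac
}

# Usage (hypothetical database name "mydb"):
#   psql -d mydb -c "$(idle_in_xact_sql old)"
```

The client_addr and xact_start columns should point at the application hosts and the oldest open transactions, which is where to look for the missing commits.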

Is there anything else running on this box?  You said previously "The 
PowerPC cpu is having some virtual layer that is shown in the Steal 
value."  I'm not sure what that means.  Are you in a virtual machine, 
or running other VMs?  Based on the top output you posted (this one and 
the very first one), you are losing half your CPU to the VM.  (Unless I'm 
totally reading this wrong... I don't have experience with VMs, so 
please someone jump in here and correct me if I'm wrong.)

-Andy
