Hi, I've been having problems with apache becoming unresponsive, and was wondering if anyone had any suggestions on what the problem might be. Basically, periodically, apache will get into a state where all the workers are stuck reading:

Server Version: Apache
Server Built: Oct 21 2009 10:54:43
Current Time: Tuesday, 15-Jun-2010 07:57:30 PDT
Restart Time: Tuesday, 15-Jun-2010 06:37:33 PDT
Parent Server Generation: 0
Server uptime: 1 hour 19 minutes 57 seconds
Total accesses: 985801 - Total Traffic: 8.1 GB
CPU Usage: u644.89 s203.76 cu3994.75 cs0 - 101% CPU load
206 requests/sec - 1.7 MB/second - 8.6 kB/request
1593 requests currently being processed, 15 idle workers

RRRRRRRRRRRRRCRRRRKRRRRRRRRRRRRRRRRRRRKRRRRRRRRRRRRRRRRRRRRRRRRR
RRRRRRRRRRRRRRRRRRRRRRRRCRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR
RRRRRRRRRRRRRRRRRRRRRRRRRRCRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRKRRRR
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRKRRR
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRCRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR
RKRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR
RRCRRRRRRRRRRRRRRRRRRRRRRKRRRRRRRRRRRKRKRRRRRRRRRRRRRRRRRRRRRRRR
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRKRRRRRRRRRRRRRRRRRRRR
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRKRRRRRRRRRRRRRR
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRKRRRRRRRRRRRRRRRRRRRRRRRRRRRRRKRR
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR
RRRRRRRRRRRRRWRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR
RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR
RRRRRRRRRRRRRRRWRRRRKKCRRKRKRRRRKRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRC
RRKKRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR
RRRRRRRRRRRRRRKRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR
RRRRRRRRCKRRRRCCRRKRRRRRRRRRRRRRRRKRRRCRRRRRRRRRRRRRCCRRRCRRCRRR
RRRRRRRKKRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRWRRRRRKRRRKRRRRRRRRRRRRRW
KKRRRRRRRKRRRRWRKRRRRRRRRRRRRRRRWRRRRRRRRRRRR___RRR__RR___R_____
WRR__RRRSS......................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................

This is prior to complete failure - sometimes whatever's blocking gets unblocked before it hits MaxClients, sometimes it doesn't.

I'm running apache 2.0.59 built with openssl 0.9.8n on AIX 6.1 with prefork, and this is virtually all SSL traffic (pretty much everything other than the scoreboard). A restart basically "fixes" the problem, in that all the workers get killed and, after the initial thrashing of starting up new workers, the server responds again.

From my understanding of the READ state above, the workers shown are stuck in one of two broad categories:

- A client made the TCP connection to the server, and is somewhere between the TCP handshake and the end of the HTTP request info. This suggests it could be a network issue (something's hanging the connections), or an openssl issue (the TLS/SSL negotiation is slow/hanging), or...?

- The request has been completed, but we're proxying to somewhere else and waiting for a response from the proxy. This potentially applies in this case, because we do have apache set up to proxy some URLs to another server.
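To put numbers on a scoreboard like the one above, the characters can be tallied per state. What follows is only a minimal Python sketch, not something taken from my setup: the function names and the short sample string are made up for illustration, and the state key is the one mod_status prints underneath the scoreboard.

```python
from collections import Counter

# Key that mod_status prints beneath the scoreboard.
STATE_NAMES = {
    "_": "waiting for connection",
    "S": "starting up",
    "R": "reading request",
    "W": "sending reply",
    "K": "keepalive (read)",
    "D": "DNS lookup",
    "C": "closing connection",
    "L": "logging",
    "G": "gracefully finishing",
    "I": "idle cleanup of worker",
    ".": "open slot with no current process",
}


def tally_scoreboard(scoreboard):
    """Count each state character, skipping whitespace between rows."""
    return Counter(ch for ch in scoreboard if not ch.isspace())


def reading_share(counts):
    """Fraction of occupied slots (everything except '.') that are in 'R'."""
    occupied = sum(n for ch, n in counts.items() if ch != ".")
    return counts["R"] / occupied if occupied else 0.0


if __name__ == "__main__":
    # Hypothetical sample, far shorter than a real scoreboard.
    sample = "RRRRKRRC RW__SS.."
    counts = tally_scoreboard(sample)
    for ch, n in counts.most_common():
        print(f"{ch}  {STATE_NAMES.get(ch, 'unknown'):<34} {n}")
    print(f"share of occupied slots reading: {reading_share(counts):.0%}")
```

Running the same tally on scoreboards captured a few seconds apart would show whether the R count climbs steadily or jumps all at once, which might help separate slow clients from a sudden stall.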
There's nothing in the access or error logs that jumps out as correlating with this problem either. There are MaxClients issues once it hits that limit, of course, but nothing related to the BUSY_READ state.

While the problem is happening, I've correlated the scoreboard with the ps/lsof/netstat output, and the second case seems unlikely because I'm not seeing any open connections to the server that apache is proxying to. It feels like there's some shared resource that all the apache workers are trying to access, but I can't figure out what it might be.

Any suggestions on a solution, or on how I might get more info out of apache as to what it's doing while everyone's in the read state? Are there other broad categories I'm missing as to why the workers might be in the read state? Is there any further info I could provide that would help?

My next step is to dive further into the apache source and see what possible resources it could be blocking on, but I'm hoping someone smarter than me already knows. :)

--
Dave Fallon