Wow so no help? No one has any ideas on what might be going on? Is this even the correct forum to try and get help? Would it help if I mentioned I can not duplicate this in my DEV nor QA server environment? That only the production servers are seeing this issue and the only difference from Prod and QA/DEV is the server hardware? I wanted to leave upgradng apache as my last resort, but if I can not get help I guess that is the only thing to try and fix it right now. On Tue, Nov 4, 2008 at 4:53 PM, Jason Cox <cscoman@xxxxxxxxx> wrote: > Bump.... > > On Thu, Oct 23, 2008 at 6:10 PM, Jason Cox <cscoman@xxxxxxxxx> wrote: >> I am seeing this issue on my web servers. It all started a couple of >> weeks ago. I looked at my change logs and see no changes that would >> correlate to this happening. >> So this is what happens. A process reaches the MaxRequestsPerChild, in >> this case 10000, it dies off but the parent process still thinks it is >> running. When I look at the server-status page the processes are in a >> "Sending Reply" state. >> >> Here is a score board from one of my servers: >> >> Scoreboard: WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW_____________________W_WWW_________________.................................................................................................................................................................... >> >> Here is a small part of what I see on the server-status page: >> >> 0-0 21171 1/9999/9999 W 12.31 3081 15 0.0 76.34 76.34 >> 10.241.48.117 www.blah.com GET /js/s_code_remote.js HTTP/1.0 >> 1-0 21172 1/9995/9995 W 12.65 2719 33 0.0 65.51 65.51 >> 10.241.48.149 free.blah.com GET /js/player.js HTTP/1.0 >> 2-0 21173 1/9996/9996 W 12.37 1725 17 0.0 67.84 67.84 >> 10.241.48.119 free.blah.com GET /js/site_catalyst.js HTTP/1.0 >> 3-0 21174 1/9996/9996 W 12.46 2718 3 0.0 66.14 66.14 >> 10.241.48.118 free.blah.com GET /js/cookies.js HTTP/1.0 >> 4-0 21175 1/9997/9997 W 12.23 2711 95 0.7 70.41 70.41 >> 10.241.48.147 free.blah.com GET /playXML/track/13356268?r=0.33 >> HTTP/1.0 >> 5-0 21176 1/9993/9993 W 12.59 2233 7 0.0 65.62 65.62 >> 10.241.48.148 free.blah.com GET /js/google_adsense.js HTTP/1.0 >> 6-0 21177 1/10000/10000 W 12.41 2367 15 0.0 64.63 64.63 >> 10.241.48.117 free.blah.com GET /js/global.js HTTP/1.0 >> 7-0 21178 1/9997/9997 W 12.49 2606 3 0.0 63.83 63.83 >> 10.241.48.148 free.blah.com GET /js/table_constructor.js HTTP/1.0 >> 8-0 21179 1/9997/9997 W 12.32 2900 17 0.0 67.04 67.04 >> 10.241.48.117 free.blah.com GET /js/cookies.js HTTP/1.0 >> >> >> Below is some stuff I captured from a web server that stopped >> responding to web requests all together. >> >> [ ~]# ps ax |grep httpd >> 21457 ? Ss 0:01 /usr/local/www/bin/httpd >> 1414 pts/0 S+ 0:00 grep httpd >> >> [ ~]# lsof -p 21457 >> COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME >> httpd 21457 root cwd DIR 8,1 4096 2 / >> httpd 21457 root rtd DIR 8,1 4096 2 / >> httpd 21457 root txt REG 8,3 501900 471 /usr1/www/bin/httpd >> httpd 21457 root mem REG 0,0 0 [vdso] >> (stat: No such file or directory) >> httpd 21457 root mem REG 8,1 126648 36049 /lib/ld-2.3.5.so >> httpd 21457 root mem REG 8,1 1489572 36050 /lib/libc-2.3.5.so >> httpd 21457 root mem REG 8,1 25476 106215 >> /usr/lib/libgdbm.so.2.0.0 >> httpd 21457 root mem REG 8,1 27660 36057 >> /lib/libcrypt-2.3.5.so >> httpd 21457 root mem REG 8,1 196676 36058 /lib/libm-2.3.5.so >> httpd 21457 root mem REG 8,1 125160 106805 >> /usr/lib/libexpat.so.0.5.0 >> httpd 21457 root mem REG 8,1 46552 31966 >> /lib/libnss_files-2.3.5.so >> httpd 21457 root DEL REG 0,8 983040 /SYSV00000000 >> httpd 21457 root 0r CHR 1,3 1326 /dev/null >> httpd 21457 root 1w CHR 1,3 1326 /dev/null >> httpd 21457 root 2w FIFO 0,6 460663664 pipe >> httpd 21457 root 3r FIFO 0,6 460663693 pipe >> httpd 21457 root 4w FIFO 0,6 460663664 pipe >> httpd 21457 root 5w FIFO 0,6 460663666 pipe >> httpd 21457 root 6w FIFO 0,6 460663668 pipe >> httpd 21457 root 7w FIFO 0,6 460663673 pipe >> httpd 21457 root 8w FIFO 0,6 460663674 pipe >> httpd 21457 root 9w FIFO 0,6 460663693 pipe >> httpd 21457 root 10r FIFO 0,6 460663696 pipe >> httpd 21457 root 11w FIFO 0,6 460663696 pipe >> httpd 21457 root 12r FIFO 0,6 460663699 pipe >> httpd 21457 root 13w FIFO 0,6 460663699 pipe >> httpd 21457 root 14r FIFO 0,6 460663702 pipe >> httpd 21457 root 15u IPv4 460663477 TCP *:http (LISTEN) >> httpd 21457 root 16w FIFO 0,6 460663702 pipe >> httpd 21457 root 17r FIFO 0,6 460663705 pipe >> httpd 21457 root 18w FIFO 0,6 460663705 pipe >> >> [ ~]# strace -p 21457 >> Process 21457 attached - interrupt to quit >> select(0, NULL, NULL, NULL, {0, 540000}) = 0 (Timeout) >> time(NULL) = 1224806756 >> waitpid(-1, 0xbfc31190, WNOHANG) = 0 >> select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) >> time(NULL) = 1224806757 >> waitpid(-1, 0xbfc31190, WNOHANG) = 0 >> select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) >> time(NULL) = 1224806758 >> waitpid(-1, 0xbfc31190, WNOHANG) = 0 >> select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) >> time(NULL) = 1224806759 >> select(19, NULL, [9 11 13 16 18], NULL, {0, 0}) = 5 (out [9 11 13 16 >> 18], left {0, 0}) >> waitpid(-1, 0xbfc31190, WNOHANG) = 0 >> select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) >> time(NULL) = 1224806760 >> waitpid(-1, 0xbfc31190, WNOHANG) = 0 >> select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) >> time(NULL) = 1224806761 >> waitpid(-1, 0xbfc31190, WNOHANG) = 0 >> select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) >> time(NULL) = 1224806762 >> waitpid(-1, 0xbfc31190, WNOHANG) = 0 >> select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) >> time(NULL) = 1224806763 >> waitpid(-1, 0xbfc31190, WNOHANG) = 0 >> select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) >> time(NULL) = 1224806764 >> waitpid(-1, 0xbfc31190, WNOHANG) = 0 >> select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) >> time(NULL) = 1224806765 >> waitpid(-1, 0xbfc31190, WNOHANG) = 0 >> select(0, NULL, NULL, NULL, {1, 0} <unfinished ...> >> Process 21457 detached >> >> Here is the server info. It is FC4: >> >> [ ~]# uname -a >> Linux web10.sd 2.6.17-1.2142smp #1 SMP Thu Sep 14 15:27:50 PDT 2006 >> i686 i686 i386 GNU/Linux >> >> [ ~]# httpd -V >> Server version: Apache/1.3.34 (Unix) >> Server built: Aug 14 2006 16:11:23 >> Server's Module Magic Number: 19990320:18 >> Server compiled with.... >> -D HAVE_MMAP >> -D HAVE_SHMGET >> -D USE_SHMGET_SCOREBOARD >> -D USE_MMAP_FILES >> -D HAVE_FCNTL_SERIALIZED_ACCEPT >> -D HAVE_SYSVSEM_SERIALIZED_ACCEPT >> -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT >> -D DYNAMIC_MODULE_LIMIT=64 >> -D HARD_SERVER_LIMIT=256 >> -D HTTPD_ROOT="/usr/local/www" >> -D SUEXEC_BIN="/usr/local/www/bin/suexec" >> -D DEFAULT_PIDLOG="logs/httpd.pid" >> -D DEFAULT_SCOREBOARD="logs/httpd.scoreboard" >> -D DEFAULT_LOCKFILE="logs/httpd.lock" >> -D DEFAULT_ERRORLOG="logs/error_log" >> -D TYPES_CONFIG_FILE="conf/mime.types" >> -D SERVER_CONFIG_FILE="conf/httpd.conf" >> -D ACCESS_CONFIG_FILE="conf/access.conf" >> -D RESOURCE_CONFIG_FILE="conf/srm.conf" >> >> Here is the top part of my httpd.conf: >> >> ################################# >> ### SECTION 1: Global Environment >> ################################# >> >> ServerType standalone >> Port 80 >> HostnameLookups Off >> User wwwadmin >> Group wwwadmin >> >> >> Listen "80" >> ServerRoot "/usr/local/www" >> DocumentRoot "/usr/local/www/htdocs" >> >> LockFile /var/lock/httpd.lock >> PidFile logs/httpd.pid >> ScoreBoardFile logs/apache_runtime_status >> Timeout 120 >> ExtendedStatus On >> UseCanonicalName On >> ServerSignature Off >> ServerTokens prod >> UserDir disabled >> >> AddDefaultCharset utf-8 >> >> ###SERVER TUNING### >> >> KeepAlive Off >> MaxKeepAliveRequests 100 >> KeepAliveTimeout 15 >> MinSpareServers 10 >> MaxSpareServers 20 >> StartServers 50 >> MaxClients 125 >> MaxRequestsPerChild 10000 >> >> >> If anyone else has seen this behavior before I would appreciate some >> help. Thanks. >> --- >> Jason Cox >> > > > > -- > Jason Cox > -- Jason Cox --------------------------------------------------------------------- The official User-To-User support forum of the Apache HTTP Server Project. See <URL:http://httpd.apache.org/userslist.html> for more info. To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx " from the digest: users-digest-unsubscribe@xxxxxxxxxxxxxxxx For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx