On Tue, May 1, 2012 at 7:22 AM, P J <pauljflists@xxxxxxxxx> wrote:
On Mon, Apr 30, 2012 at 10:37 AM, P J <pauljflists@xxxxxxxxx> wrote:

To troubleshoot this you should have at least two additional outputs:
netstat -pant, showing connection states,
and
service httpd fullstatus, listing the current state of all the Apache processes/threads.
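
Because netstat -pant output can run to thousands of lines on a busy server, a per-state summary is often easier to read than the raw listing. A minimal sketch, assuming the usual Linux net-tools column layout where the TCP state is the sixth field; the function name count_states is made up for illustration:

```shell
# Count TCP connections per state from `netstat -pant` output read on stdin.
# Assumption: the state is the 6th whitespace-separated field on tcp/tcp6 lines.
count_states() {
    awk '$1 ~ /^tcp/ {count[$6]++} END {for (s in count) print count[s], s}' | sort -rn
}
```

It would be used as: netstat -pant | count_states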
What applications is your Apache serving?
PHP? Is it mod_php, mod_python, mod_perl?
What is the access log for the most accessed vhost showing?
Any pattern of a slow, connection-consuming attack?
If so, and all tasks are in the Keep Alive wait state, then disable
KeepAlive and lower the general Timeout to just 7 seconds.
The "connect to listener on [::]:80" error is quite unusual.
ETIMEDOUT
Timeout while attempting connection. The server may be too busy to
accept new connections. Note that for IP sockets the timeout may be
very long when syncookies are enabled on the server.
cat /proc/sys/fs/file-nr
cat /proc/$(pidof -s httpd)/limits
Sincerely,
Alexandr Normalex

Hi Alexandr, thanks for taking a look at this with me.

The traffic pattern for this website is that at certain times of the day it receives huge spikes of traffic in very short periods of time. We are trying to tune Apache to accommodate it as best we can.

cat /proc/$(pidof -s httpd)/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            10485760             unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             55296                55296                processes
Max open files            1024                 1024                 files
Max locked memory         32768                32768                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       55296                55296                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0

cat /proc/sys/fs/file-nr
1530 0 560543

Looking at Max open files I see what is likely the problem :)

Max open files 1024

I swear I modified this to 4096! I've changed the limit to 4096 now, and I'll double check it tomorrow. Hopefully this will be the obvious fix!

I will check service httpd fullstatus and netstat -pant tomorrow morning when this happens again. It happens at the same time every day. It is not an attack; the customer's application receives massive amounts of connections at certain times of the day.

I've been working with Apache for 15 years and I've never seen the "connect to listener on [::]:80" error message before. I hope it's related to reaching Max open files.

Thanks again for your help.

--
PJ

I was hoping this would be fixed now that Max open files has been updated, but it was the same issue this morning.

cat /proc/$(pidof -s httpd)/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            10485760             unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             55296                55296                processes
Max open files            1024                 1024                 files
Max locked memory         32768                32768                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       55296                55296                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0

Once it reaches 1000 total children:

[info] server seems busy, (you may need to increase StartServers, or Min/MaxSpareServers), spawning 32 children, there are 17 idle, and 1002 total children

After 1000 total children:

mpm_common.c(663): (70007)The timeout specified has expired: connect to listener on [::]:80
mpm_common.c(663): (70007)The timeout specified has expired: connect to listener on [::]:80
mpm_common.c(663): (70007)The timeout specified has expired: connect to listener on [::]:80

This continues until Apache is restarted.

I tried to run service httpd fullstatus during this time but it wasn't able to connect:

ELinks: Connection refused.

I did capture the output of netstat -pant, which shows many connections to the MySQL DB as well. I've double checked that MySQL has not reached max connections and that it's still working during this time.

The netstat output is so big I have to put it up on pastebin:

I don't understand why this is happening at 1000 children. What limit is it hitting?

Apache config:

Timeout 30
KeepAlive On
MaxKeepAliveRequests 10000
KeepAliveTimeout 3

<IfModule prefork.c>
StartServers 80
MinSpareServers 50
MaxSpareServers 120
ServerLimit 3500
MaxClients 3500
MaxRequestsPerChild 4000
</IfModule>

Any help would be greatly appreciated.

--
PJ

Haha, Max open files still says 1024!! I hardcoded it to 16384 yesterday, something keeps resetting it!

Let me figure this out before I keep bugging the list :)

Thanks,
--
PJ
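
Since pidof -s reports only a single httpd process, it may help to look at the open-files limit of every running process after each restart, to see whether a new setting actually took effect. A minimal sketch, assuming the server binary is named httpd as in the commands above and that pidof is available; the function names soft_nofile and report_nofile are made up for illustration:

```shell
# Print the soft "Max open files" value from a /proc/<pid>/limits-style file.
soft_nofile() {
    awk '/^Max open files/ {print $4}' "$1"
}

# Print "<pid> <soft limit>" for every running process with the given name.
# Assumption: pidof is installed, as in the commands earlier in the thread.
report_nofile() {
    for pid in $(pidof "$1"); do
        printf '%s %s\n' "$pid" "$(soft_nofile "/proc/$pid/limits")"
    done
}
```

Running report_nofile httpd right after a restart would show whether the running processes picked up the intended value or are still at 1024.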