You will need to do some more triaging. Suggestions for things to investigate more deeply: HTTP log files, system log files, system performance monitoring, connection statistics, source of traffic, TCP performance tuning, firewall control, protection against DoS attacks… and that’s just off the top of my head. You will need to profile the system using whichever tools suit your needs (e.g. tcpdump, lsof, top, netstat, ss, and a large variety of others, depending on what you learn along the way).

Matt.

From: Rose, John B <jbrose@xxxxxxx>
I don't think the TCP buffer would be clear if there was a continuing flow of HTTP requests during that time, whether the web server software was down or maxed out. But maybe I am wrong.

From: Darryl Philip Baker <darryl.baker@xxxxxxxxxxxxxxxx>

No PHP on the system at all. The web server was down for 15-20 minutes, so anything in the queue should have cleared, right?

Darryl Baker
(he/him/his)
Sr. System Administrator
Distributed Application Platform Services
Northwestern University
1800 Sherman Ave.
Suite 6-600 – Box #39
Evanston, IL 60201-3715
(847) 467-6674

From: "Rose, John B" <jbrose@xxxxxxx>

Regarding the "load increasing quickly after restarting the daemons" ... I do not believe just restarting the daemons clears the TCP queue, nor does it prevent new TCP requests. If it is an attack, then the load would ramp back up immediately.
That is why you have to reboot, I am guessing.

Do you utilize PHP? PHP-FPM? Do you use TCP or Unix domain sockets? Is there a preponderance of HTTP connections or PHP-FPM processes, or both? If PHP-FPM, do you use "static", "dynamic", or "ondemand"?

From: Darryl Philip Baker <darryl.baker@xxxxxxxxxxxxxxxx>

Gentlefolk,

I had an incident yesterday where the Apache web server host had a load average of over 170 and was performing very slowly. Stopping the web server did fix the issue, but when I restarted the daemons the load started to increase very quickly. I ended up having to reboot the system to fix the issue. I don’t like that one bit; this is a Linux system, not a Windows server. (Editorial remark: I have found that systems need reboots to fix stuff much more frequently since the adoption of systemd.)

I have been asked to do a root cause analysis, but I have not found anything as of yet. I am reaching out for help in this matter. The system is a RHEL7 ESX VM with Red Hat’s mainline distribution of Apache 2.4, as opposed to the RHSCL version. The configuration is quite complex and a bit sensitive, so I cannot share all of that. What I’m looking for is techniques for looking at what happened, rather than simply being given the answer.

Darryl Baker
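As one way to act on the suggestions in the thread to look at connection statistics and source of traffic, here is a minimal sketch that tallies TCP socket states and the busiest peer addresses on the web ports. It is only an illustration under some assumptions: Python 3.7 or later is available, the `ss` utility is installed, and the web server listens on ports 80 and 443. The function names `parse_ss` and `summarize` are made up for this example and come from nobody's actual tooling.

```python
import subprocess
from collections import Counter


def parse_ss(text):
    """Turn `ss -tan` output into (state, local, peer) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and the column header row.
        if len(parts) < 5 or parts[0] == "State":
            continue
        # Columns: State Recv-Q Send-Q Local-Address:Port Peer-Address:Port
        rows.append((parts[0], parts[3], parts[4]))
    return rows


def summarize(rows, ports=("80", "443"), top=10):
    """Count socket states and peer addresses for the given local ports.

    The default ports are an assumption; adjust them to the real listeners.
    """
    states = Counter()
    peers = Counter()
    for state, local, peer in rows:
        # The port is whatever follows the last colon (also true for IPv6).
        if local.rsplit(":", 1)[-1] not in ports:
            continue
        states[state] += 1
        if state != "LISTEN":
            peers[peer.rsplit(":", 1)[0]] += 1
    return states, peers.most_common(top)


if __name__ == "__main__":
    # -t: TCP only, -a: all sockets including listeners, -n: numeric output
    out = subprocess.run(["ss", "-tan"], capture_output=True,
                         text=True, check=True).stdout
    states, peers = summarize(parse_ss(out))
    print("States on web ports:", dict(states))
    for addr, count in peers:
        print(f"{count:6d}  {addr}")
```

Run while the load is climbing, a large SYN-RECV count or a handful of peers holding most of the connections would point toward the traffic side; if the numbers look ordinary, the system logs and process states are probably the next place to look.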