It does strike me as odd: if there were continued requests, why didn’t the load pick right back up after the reboot? The reboot would have taken less than 3 minutes.
Also, any connection coming in through the load balancer (the servers are not directly addressable) would have been routed to one of the other servers while Apache was down.
Darryl Baker
(he/him/his) Sr. System Administrator Distributed Application Platform Services Northwestern University 1800 Sherman Ave. Suite 6-600 – Box #39 Evanston, IL 60201-3715 (847) 467-6674

From: "Rose, John B" <jbrose@xxxxxxx>
I don't think the TCP buffer would be clear if there was a continuing flow of HTTP requests during that time, whether the web server software was down or maxed out. But maybe I am wrong.

From: Darryl Philip Baker <darryl.baker@xxxxxxxxxxxxxxxx>
No PHP on the system at all. The web server was down for 15-20 minutes, so anything in the queue should have cleared, right?
Darryl Baker
From:
"Rose, John B" <jbrose@xxxxxxx>
Regarding the "load increasing quickly after restarting the daemons" ... I do not believe just restarting the daemons clears the TCP queue, nor does it prevent new TCP requests. If it is an attack, then the load would ramp back up immediately. That is why you have to reboot, I am guessing.
Do you utilize PHP? PHP-FPM? Do you use TCP or Unix domain sockets? Is there a preponderance of HTTP connections or PHP-FPM processes, or both? If PHP-FPM, do you use "static", "dynamic", or "ondemand"?

From: Darryl Philip Baker <darryl.baker@xxxxxxxxxxxxxxxx>
Gentlefolk,
I had an incident yesterday where the Apache web server host had a load average of over 170 and was performing very slowly. Stopping the web server did fix the issue, but when I restarted the daemons the load started to increase very quickly. I ended up having to reboot the system to fix the issue. I don’t like that one bit; this is a Linux system, not a Windows server. (Editorial remark: I have found that systems need reboots to fix stuff much more frequently since the adoption of systemd.)
I have been asked to do a root cause analysis, but I have not found anything as of yet. I am reaching out for help in this matter. The system is a RHEL7 ESX VM with Red Hat’s mainline distribution of Apache 2.4, as opposed to the RHSCL version. The configuration is quite complex and a bit sensitive, so I cannot share all of that. What I’m looking for is techniques to look at what happened rather than being given the answer anyway.
Darryl Baker
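
On the request for techniques rather than answers: one cheap thing to capture the next time the load climbs is a snapshot of the load average alongside the TCP socket states on the web port, since that helps separate "many connections" from "few connections but many blocked tasks". The following is a minimal sketch, not anything used in this thread. It assumes a Linux host with the usual /proc/loadavg and /proc/net/tcp layouts, assumes the site listens on port 80, and all function names are hypothetical.

```python
from collections import Counter
from pathlib import Path

# State codes used in the "st" column of /proc/net/tcp and /proc/net/tcp6.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(table_text, local_port=None):
    """Count sockets per TCP state in the text of /proc/net/tcp or tcp6.

    If local_port is given, only sockets bound to that local port count.
    """
    counts = Counter()
    for line in table_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue
        local_addr, state_hex = fields[1], fields[3]
        # The local address is HEXIP:HEXPORT; the port is the last part.
        port = int(local_addr.rsplit(":", 1)[1], 16)
        if local_port is not None and port != local_port:
            continue
        counts[TCP_STATES.get(state_hex.upper(), state_hex)] += 1
    return counts


def parse_loadavg(text):
    """Split /proc/loadavg into load averages and task counts."""
    load1, load5, load15, tasks, _last_pid = text.split()
    runnable, total = tasks.split("/")
    return {
        "load1": float(load1),
        "load5": float(load5),
        "load15": float(load15),
        "runnable": int(runnable),  # scheduling entities runnable right now
        "total": int(total),        # all scheduling entities on the system
    }


def snapshot(local_port=80):
    """Read the live /proc files (Linux only) and return one sample.

    Port 80 is an assumption; pass 443 or whatever the site listens on.
    """
    sample = parse_loadavg(Path("/proc/loadavg").read_text())
    states = Counter()
    for name in ("/proc/net/tcp", "/proc/net/tcp6"):
        path = Path(name)
        if path.exists():
            states += count_tcp_states(path.read_text(), local_port)
    sample["tcp_states"] = dict(states)
    return sample
```

Calling `snapshot()` from cron or a loop every few seconds and logging the result gives a timeline to compare against the Apache logs. Because the Linux load average counts tasks in uninterruptible sleep as well as runnable ones, a very high `load1` next to a small `runnable` figure and few ESTABLISHED sockets would point toward blocked I/O rather than request volume.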