Re: Keepalive closing connections prematurely on high load on newer httpd versions

Hi

I can see that more than 8,000 requests have timed out on 'Rocky'. This is mostly due to Apache being unable to handle the load. Is it possible to increase "KeepAliveTimeout" (and the other KeepAlive parameters)?

Could you also post the hardware utilisation figures for the two servers (CentOS and Rocky)?

Deepak
"The greatness of a nation can be judged by the way its animals are treated - Mahatma Gandhi"


"Plant a Tree, Go Green"



On Mon, May 22, 2023 at 3:49 PM Mateusz Kempski <mkempski@xxxxxxxxxxxx.invalid> wrote:
Hi all,
I have two identical VMs (16 GB RAM, 16 vCPUs). One is a fresh CentOS 7
install, the other is a fresh Rocky 8. I installed httpd (version 2.4.6
on CentOS 7 and 2.4.37 on Rocky 8), configured both to serve the same
static default HTML file and enabled mpm event on CentOS (mpm event is
the default on Rocky). Then I added the following options to the
default config on both servers:
```
<IfModule mpm_event_module>
ThreadsPerChild 25
StartServers 3
ServerLimit 120
MinSpareThreads 75
MaxSpareThreads 3000
MaxRequestWorkers 3000
MaxConnectionsPerChild 0
</IfModule>
```
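For reference, with the event MPM the value of MaxRequestWorkers cannot exceed ServerLimit multiplied by ThreadsPerChild. The small Python sketch below only checks that arithmetic for the two configurations quoted in this message; the helper name `max_workers_ceiling` is made up for illustration and is not part of httpd.

```python
# Sanity check: MaxRequestWorkers should fit within ServerLimit * ThreadsPerChild.
# The helper name is hypothetical; the numbers are copied from this message.

def max_workers_ceiling(server_limit: int, threads_per_child: int) -> int:
    """Upper bound on worker threads that the process/thread limits allow."""
    return server_limit * threads_per_child


configs = {
    "first set": {"ThreadsPerChild": 25, "ServerLimit": 120, "MaxRequestWorkers": 3000},
    "second set": {"ThreadsPerChild": 50, "ServerLimit": 120, "MaxRequestWorkers": 6000},
}

for name, cfg in configs.items():
    ceiling = max_workers_ceiling(cfg["ServerLimit"], cfg["ThreadsPerChild"])
    fits = cfg["MaxRequestWorkers"] <= ceiling
    print(f"{name}: ceiling={ceiling}, MaxRequestWorkers={cfg['MaxRequestWorkers']}, fits={fits}")
```

In both cases MaxRequestWorkers sits exactly at the ceiling, so neither configuration asks for more workers than the limits allow.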
After that I ran ab tests with keepalive from a different
CentOS 7 VM in the same local network. Can you help me understand
these results? On CentOS I am able to complete 1 million requests at
1000 concurrent connections with little to no errors, however with
version 2.4.37 on Rocky I get a lot of failed requests due to length
and exceptions. The served content is static, so I am assuming this is
because keepalive connections are closed by the server. I tried
various configurations of KeepAliveTimeout, MaxKeepAliveRequests,
AsyncRequestWorkerFactor and other mpm_event options but I got no
better results. The only thing that seems to improve the stability of
keepalive connections is setting threads and servers to the moon, for
example:
```
<IfModule mpm_event_module>
ThreadsPerChild 50
StartServers 120
ServerLimit 120
MinSpareThreads 6000
MaxSpareThreads 6000
MaxRequestWorkers 6000
MaxConnectionsPerChild 0
</IfModule>
```
However, it still throws more errors (~8k) than 2.4.6 with the first set
of settings. This problem occurs only when using keepalive. There are
no errors when using ab without the -k option, although the speed is
lower. I can replicate this issue on the newest httpd built from source
(2.4.57). What is causing this difference in behavior? How can I achieve
the performance of 2.4.6 on 2.4.37 / 2.4.57 without throwing much more
resources at httpd? These errors seem to be the root cause of a problem
we have with 502 errors thrown by a downstream reverse proxy server
when httpd kills a keepalive connection prematurely and the proxy then
tries to use that connection.

Below are the results of the ab tests:

CentOS 7 VM:
```
ab -k -t 900 -c 1000 -n 1000000 http://centos/
This is ApacheBench, Version 2.3 <$Revision: 1430300 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 10.1.3.3 (be patient)
Completed 100000 requests
Completed 200000 requests
Completed 300000 requests
Completed 400000 requests
Completed 500000 requests
Completed 600000 requests
Completed 700000 requests
Completed 800000 requests
Completed 900000 requests
Completed 1000000 requests
Finished 1000000 requests


Server Software:        Apache/2.4.6
Server Hostname:        10.1.3.3
Server Port:            80

Document Path:          /
Document Length:        7620 bytes

Concurrency Level:      1000
Time taken for tests:   15.285 seconds
Complete requests:      1000000
Failed requests:        67
  (Connect: 0, Receive: 0, Length: 67, Exceptions: 0)
Write errors:           0
Keep-Alive requests:    990567
Total transferred:      7919057974 bytes
HTML transferred:       7619489460 bytes
Requests per second:    65422.95 [#/sec] (mean)
Time per request:       15.285 [ms] (mean)
Time per request:       0.015 [ms] (mean, across all concurrent requests)
Transfer rate:          505945.41 [Kbytes/sec] received

Connection Times (ms)
             min  mean[+/-sd] median   max
Connect:        0    0   4.7      0    1042
Processing:     3   15  16.8     13     467
Waiting:        0   14  16.2     13     433
Total:          3   15  17.7     13    1081

Percentage of the requests served within a certain time (ms)
 50%     13
 66%     15
 75%     16
 80%     17
 90%     21
 95%     23
 98%     32
 99%     44
100%   1081 (longest request)
```

Rocky 8 VM:
```
ab -k -t 900 -c 1000 -n 1000000 http://rocky/
This is ApacheBench, Version 2.3 <$Revision: 1430300 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 10.1.3.11 (be patient)
Completed 100000 requests
Completed 200000 requests
Completed 300000 requests
Completed 400000 requests
Completed 500000 requests
Completed 600000 requests
Completed 700000 requests
Completed 800000 requests
Completed 900000 requests
Completed 1000000 requests
Finished 1000000 requests


Server Software:        Apache/2.4.37
Server Hostname:        10.1.3.11
Server Port:            80

Document Path:          /
Document Length:        7620 bytes

Concurrency Level:      1000
Time taken for tests:   19.101 seconds
Complete requests:      1000000
Failed requests:        93159
  (Connect: 0, Receive: 0, Length: 85029, Exceptions: 8130)
Write errors:           0
Keep-Alive requests:    912753
Total transferred:      7248228337 bytes
HTML transferred:       6973694460 bytes
Requests per second:    52352.12 [#/sec] (mean)
Time per request:       19.101 [ms] (mean)
Time per request:       0.019 [ms] (mean, across all concurrent requests)
Transfer rate:          370566.51 [Kbytes/sec] received

Connection Times (ms)
             min  mean[+/-sd] median   max
Connect:        0    4  54.4      0    3022
Processing:     0   15  20.6     13    1400
Waiting:        0   13  17.7     12    1400
Total:          0   19  59.0     13    3048

Percentage of the requests served within a certain time (ms)
 50%     13
 66%     15
 75%     17
 80%     19
 90%     24
 95%     41
 98%     53
 99%    106
100%   3048 (longest request)
```
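To put the two runs side by side, the following Python sketch recomputes the failure rate and the number of requests that ab did not count as keep-alive, using only the figures from the two outputs above. The `AbRun` class and its field names are illustrative, not anything ab itself provides.

```python
from dataclasses import dataclass


@dataclass
class AbRun:
    """Figures copied by hand from one ab summary (illustrative container)."""
    name: str
    complete: int
    failed: int
    length_errors: int
    exceptions: int
    keepalive: int

    def failure_rate(self) -> float:
        # Fraction of completed requests that ab flagged as failed.
        return self.failed / self.complete

    def non_keepalive(self) -> int:
        # Requests that ab did not count as keep-alive requests.
        return self.complete - self.keepalive


runs = [
    AbRun("CentOS 7 / 2.4.6", 1_000_000, 67, 67, 0, 990_567),
    AbRun("Rocky 8 / 2.4.37", 1_000_000, 93_159, 85_029, 8_130, 912_753),
]

for run in runs:
    # ab's failure breakdown should add up to the reported total.
    assert run.length_errors + run.exceptions == run.failed
    print(
        f"{run.name}: {run.failure_rate():.4%} failed, "
        f"{run.non_keepalive()} requests not counted as keep-alive"
    )
```

The gap between the two runs is visible in both numbers: the Rocky run has far more failed requests and far more requests outside the keep-alive count.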

---
Mateusz Kempski
