Re: mod_proxy: When does a backend be considered as failed?

On 07/30/2016 04:37 PM, Yann Ylavic wrote:
On Sat, Jul 30, 2016 at 8:36 AM, dE <de.techno@xxxxxxxxx> wrote:
> Yes, as you said I did try that before --

Your previous configuration had timeout=10 on the ProxyPass line, not
the BalancerMember one as expected.

> BalancerMember balancer://localbalance/ http://[fc00::1:4]/ timeout=10
> ProxyPass / balancer://localbalance/ failontimeout=on
>
> I even tried the retry (set to 600) and forcerecovery parameter.

Really, without the timeout=10 on the ProxyPass (like the
configuration you mentioned in your previous message)?

> Have you ever tried this in the tests that I'm doing (SIGSTOP apache)?

Actually yes, with that exact configuration:

<VirtualHost *:8080>
     ServerName localhost:8080
     ProxyPass / balancer://localbalance/ failontimeout=on failonstatus=502 forcerecovery=off
     BalancerMember balancer://localbalance/ http://127.0.0.1:80/ timeout=10 retry=30
</VirtualHost>


1. The backend is initially not started:

$ time nc localhost 8080 <<\EOF
GET / HTTP/1.1
Host: localhost:8080

EOF
HTTP/1.1 503 Service Unavailable
Date: Sat, 30 Jul 2016 10:33:29 GMT
Server: Apache/2.4.24-dev (Unix) OpenSSL/1.0.2g
[...]
real    0m0.005s
user    0m0.000s
sys    0m0.000s

$ tail -3 error_log
[Sat Jul 30 12:33:29.725314 2016] [proxy:error] [pid 14463:tid
139636299003648] (111)Connection refused: AH00957: HTTP: attempt to
connect to 127.0.0.1:80 (127.0.0.1) failed
[Sat Jul 30 12:33:29.725392 2016] [proxy:error] [pid 14463:tid
139636299003648] AH00959: ap_proxy_connect_backend disabling worker
for (127.0.0.1) for 30s
[Sat Jul 30 12:33:29.725415 2016] [proxy_http:error] [pid 14463:tid
139636299003648] [client ::1:50006] AH01114: HTTP: failed to make
connection to backend: 127.0.0.1
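
For anyone who would rather script the probe than type the nc
here-document each time, the request used in these steps could be
sketched in Python roughly as follows. This is only an illustration:
the function name probe, the added "Connection: close" header and the
CRLF line endings are assumptions of the sketch, not part of the test
itself.

```python
import socket
import time


def probe(host: str, port: int, host_header: str, timeout: float = 15.0):
    """Send a bare 'GET /' and return (status_line, elapsed_seconds).

    Hypothetical helper: roughly what `time nc host port` with the
    here-document does, reduced to the status line and the wall time.
    """
    request = (
        "GET / HTTP/1.1\r\n"
        f"Host: {host_header}\r\n"
        "Connection: close\r\n"   # assumption: lets the server close early
        "\r\n"
    ).encode("ascii")

    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        data = b""
        # Read only until the first line of the response is complete.
        while b"\r\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    elapsed = time.monotonic() - start

    status_line = data.split(b"\r\n", 1)[0].decode("latin-1")
    return status_line, elapsed
```

Calling probe("localhost", 8080, "localhost:8080") against the
VirtualHost above would return the status line and the elapsed time,
which are the two values compared from step to step.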


2. The backend is still not started, ~6 seconds later (i.e. < 30s):

$ time nc localhost 8080 <<\EOF
GET / HTTP/1.1
Host: localhost:8080

EOF
HTTP/1.1 503 Service Unavailable
Date: Sat, 30 Jul 2016 10:33:35 GMT
Server: Apache/2.4.24-dev (Unix) OpenSSL/1.0.2g
[...]
real    0m0.004s
user    0m0.000s
sys    0m0.000s

$ tail -1 error_log
[Sat Jul 30 12:33:35.574164 2016] [proxy_balancer:error] [pid
14463:tid 139636381173504] [client ::1:50010] AH01170:
balancer://localbalance: All workers are in error state


3. The backend is now started, ~15s later (i.e. still < 30s):

$ time nc localhost 8080 <<\EOF
GET / HTTP/1.1
Host: localhost:8080

EOF
HTTP/1.1 503 Service Unavailable
Date: Sat, 30 Jul 2016 10:33:44 GMT
Server: Apache/2.4.24-dev (Unix) OpenSSL/1.0.2g
[...]
real    0m0.004s
user    0m0.000s
sys    0m0.000s

$ tail -1 error_log
[Sat Jul 30 12:33:44.327913 2016] [proxy_balancer:error] [pid
14463:tid 139636282218240] [client ::1:50012] AH01170:
balancer://localbalance: All workers are in error state


4. The backend is still running normally, ~1 min later (i.e. > 30s):
$ time nc localhost 8080 <<\EOF
GET / HTTP/1.1
Host: localhost:8080

EOF
HTTP/1.1 200 OK
Date: Sat, 30 Jul 2016 10:34:40 GMT
Server: Apache
[...]
real    0m0.006s
user    0m0.004s
sys    0m0.000s


5. The backend is SIGSTOP-ed (no VM used here, so the backend runs with
"ServerLimit 1" and its only child gets kill -STOP):
$ time nc localhost 8080 <<\EOF
GET / HTTP/1.1
Host: localhost:8080

EOF
HTTP/1.1 502 Proxy Error
Date: Sat, 30 Jul 2016 10:34:51 GMT
Server: Apache/2.4.24-dev (Unix) OpenSSL/1.0.2g
[...]
real    0m10.015s
user    0m0.000s
sys    0m0.004s

$ tail -4 error_log
[Sat Jul 30 12:35:01.355768 2016] [proxy_http:error] [pid 14463:tid
139636273825536] (70007)The timeout specified has expired: [client
::1:50018] AH01102: error reading status line from remote server
127.0.0.1:80
[Sat Jul 30 12:35:01.355849 2016] [proxy:error] [pid 14463:tid
139636273825536] [client ::1:50018] AH00898: Error reading from remote
server returned by /
[Sat Jul 30 12:35:01.355942 2016] [proxy_balancer:error] [pid
14463:tid 139636273825536] [client ::1:50018] AH01174:
balancer://localbalance: Forcing worker (http://127.0.0.1/) into error
state due to status code 502 matching 'failonstatus' balancer
parameter
[Sat Jul 30 12:35:01.355964 2016] [proxy_balancer:error] [pid
14463:tid 139636273825536] [client ::1:50018] AH02460:
balancer://localbalance: Forcing worker (http://127.0.0.1/) into error
state due to timeout and 'failontimeout' parameter being set
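
Since the mail wraps each error_log entry over several lines, it can
help to re-join them and pull out the AH codes when comparing steps.
A rough Python sketch follows. The helper names join_wrapped and
summarize are made up for this example, and the short descriptions
are paraphrased from the log messages quoted in this thread, not taken
from any official list of codes.

```python
import re

# Descriptions paraphrased from the log lines quoted in this thread.
AH_EVENTS = {
    "AH00957": "connect to backend failed",
    "AH00959": "worker disabled for the retry period",
    "AH01114": "failed to make connection to backend",
    "AH01170": "all workers are in error state",
    "AH01102": "error reading status line from backend",
    "AH00898": "error reading from remote server",
    "AH01174": "worker forced into error state by failonstatus",
    "AH02460": "worker forced into error state by failontimeout",
}

# An entry starts with a timestamp such as "[Sat Jul 30 12:33:29.725314 2016]".
ENTRY_START = re.compile(r"^\[\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}")
AH_CODE = re.compile(r"\bAH\d{5}\b")


def join_wrapped(lines):
    """Re-join log entries that were wrapped across several lines."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if ENTRY_START.match(line) or not entries:
            entries.append(line)          # a new entry begins here
        else:
            entries[-1] += " " + line     # continuation of the previous one
    return entries


def summarize(lines):
    """Return (code, description) for the first AH code of each entry."""
    result = []
    for entry in join_wrapped(lines):
        match = AH_CODE.search(entry)
        if match:
            code = match.group(0)
            result.append((code, AH_EVENTS.get(code, "unknown")))
    return result
```

Fed the four entries above, summarize() would list AH01102, AH00898,
AH01174 and AH02460 in that order, i.e. the read timeout followed by
the two reasons the worker was put into error state.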


6. The backend is still SIGSTOP-ed, ~25s later (i.e. < 30s):
$ time nc localhost 8080 <<\EOF
GET / HTTP/1.1
Host: localhost:8080

EOF
HTTP/1.1 503 Service Unavailable
Date: Sat, 30 Jul 2016 10:35:26 GMT
Server: Apache/2.4.24-dev (Unix) OpenSSL/1.0.2g
[...]
real    0m0.004s
user    0m0.000s
sys    0m0.000s

$ tail -1 error_log
[Sat Jul 30 12:35:26.797383 2016] [proxy_balancer:error] [pid
14463:tid 139636265432832] [client ::1:50022] AH01170:
balancer://localbalance: All workers are in error state


So everything works as expected for me: timeouts are always 10s max,
and the worker is never retried before the retry= period has elapsed.
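
To make the timing in steps 1 to 6 easier to follow, here is a toy
Python model of a single balancer member. It is only a sketch of the
behaviour observed above, not mod_proxy's actual logic: the class
ToyWorker and its outcome strings are invented for illustration, and
it assumes failontimeout=on and failonstatus=502 are set, so that a
read timeout also puts the worker into error state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyWorker:
    """Toy model of one balancer member (hypothetical, for illustration)."""
    retry: float = 30.0                  # retry= on the BalancerMember line
    failed_at: Optional[float] = None    # when the worker entered error state

    def usable(self, now: float) -> bool:
        # A worker in error state is skipped until `retry` seconds have passed.
        return self.failed_at is None or now - self.failed_at >= self.retry

    def handle(self, now: float, outcome: str) -> int:
        """Return the status the client sees; `now` is when the result is known.

        outcome is what the backend would do if contacted:
        "ok", "refused" (not started) or "timeout" (SIGSTOP-ed).
        """
        if not self.usable(now):
            return 503                   # all workers are in error state
        if outcome == "ok":
            self.failed_at = None        # success clears the error state
            return 200
        self.failed_at = now             # failure starts a new retry window
        return 502 if outcome == "timeout" else 503


if __name__ == "__main__":
    # Seconds since step 1, taken from the error_log timestamps above.
    worker = ToyWorker(retry=30)
    timeline = [(0, "refused"), (6, "refused"), (15, "ok"),
                (71, "ok"), (92, "timeout"), (117, "timeout")]
    for t, outcome in timeline:
        print(t, outcome, worker.handle(t, outcome))
```

The point of the model is that the backend's real state only matters
once the retry window has elapsed: in step 3 the backend is already up,
yet the client still gets a 503 because the worker is not tried again
until 30s after the failure.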

Regards,
Yann.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx
For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx

Got it! It works now. Thank you so much!

I just forgot a few parameters.





