Thanks Alex, this is very helpful.
Another interesting fact is that I can't reproduce this issue if squid has no other traffic except my testing requests. But it's easy to reproduce when server has other traffic.
The problem is that with other traffic I can't provide the whole log file with debug ALL,7 enabled because it has other requests.
So I tried to select only parts related to my test request (this is ALL,7):
On Tue, Nov 28, 2017 at 7:32 AM, Alex Rousskov <rousskov@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
On 11/27/2017 05:19 PM, Ivan Larionov wrote:
> I see retries only when squid config has 2 parents. If I comment out
> everything related to "newproxy" I can't reproduce this behavior anymore.
The posted logs are not detailed enough to confirm or deny that IMO, but
I suspect that you are dealing with at least one bug.
> https://wiki.squid-cache.org/SquidFaq/InnerWorkings#When_ does_Squid_re-forward_a_ client_request.3F
>
>> Squid does not try to re-forward a request if at least one of the following conditions is true:
>>
>> The number of forwarding attempts exceeded forward_max_tries. For
>> example, if you set forward_max_tries to 1 (one), then no requests
>> will be re-forwarded.
AFAICT, there is an off-by-one bug in Squid that violates the above:
> if (n_tries > Config.forward_max_tries)
> return false;
The n_tries counter is incremented before Squid makes a request
forwarding attempt. With n_tries and Config.forward_max_tries both set
to 1, the quoted FwdState::checkRetry() code will not prevent
re-forwarding. There is a similar problem in FwdState::reforward(). This
reasoning needs confirmation/testing.
Please note that simply changing the ">" operator to ">=" may break
other things in a difficult-to-detect-by-simple-tests ways. The correct
fix may be more complex than it looks and may involve making policy
decisions regarding forward_max_tries meaning. The best fix would remove
checkRetry() and reforward() duplication. This code is difficult to work
with; many related code names are misleading.
>> Squid has no alternative destinations to try. Please note that
>> alternative destinations may include multiple next hop IP addresses
>> and multiple peers.
The fact that Squid sends two requests to the same peer with only one
peer address selected suggests that Squid is retrying a failed
persistent connection rather than re-forwarding after receiving a bad
response. Again, the logs are not detailed enough to distinguish the two
cases. I can only see that a single peer/destination address was
selected (not two), which is correct/expected behavior. I cannot see
what happened next with sufficient detail.
Going forward, you have several options, including:
A. Post a link to compressed ALL,7+ logs to confirm bug(s).
B. Fix the broken condition(s) in FwdState. See above.
HTH,
Alex.
> 2017/11/27 15:53:40.542| 5,2| TcpAcceptor.cc(220) doAccept: New connection on FD 15
> 2017/11/27 15:53:40.542| 5,2| TcpAcceptor.cc(295) acceptNext: connection
> on local=0.0.0.0:3128 <http://0.0.0.0:3128> remote=[::] FD 15 flags=9
> 2017/11/27 15:53:40.543| 11,2| client_side.cc(2372) parseHttpRequest:
> HTTP Client local=127.0.0.1:3128 <http://127.0.0.1:3128>
> remote=127.0.0.1:53798 <http://127.0.0.1:53798> FD 45 flags=1
> 2017/11/27 15:53:40.543| 11,2| client_side.cc(2373) parseHttpRequest:
> HTTP Client REQUEST:
> ---------
> GET http://HOST:12345/ HTTP/1.1
> Host: HOST:12345
> User-Agent: curl/7.51.0
> Accept: */*
> Proxy-Connection: Keep-Alive
>
>
> ----------
> 2017/11/27 15:53:40.543| 85,2| client_side_request.cc(745)
> clientAccessCheckDone: The request GET http://HOST:12345/ is ALLOWED;
> last ACL checked: localhost
> 2017/11/27 15:53:40.543| 85,2| client_side_request.cc(721)
> clientAccessCheck2: No adapted_http_access configuration. default: ALLOW
> 2017/11/27 15:53:40.543| 85,2| client_side_request.cc(745)
> clientAccessCheckDone: The request GET http://HOST:12345/ is ALLOWED;
> last ACL checked: localhost
> 2017/11/27 15:53:40.543| 17,2| FwdState.cc(133) FwdState: Forwarding
> client request local=127.0.0.1:3128 <http://127.0.0.1:3128>
> remote=127.0.0.1:53798 <http://127.0.0.1:53798> FD 45 flags=1,
> url="" href="http://HOST:12345/" rel="noreferrer" target="_blank">http://HOST:12345/
> 2017/11/27 15:53:40.543| 44,2| peer_select.cc(258) peerSelectDnsPaths:
> Find IP destination for: http://HOST:12345/' via 127.0.0.1
> 2017/11/27 15:53:40.543| 44,2| peer_select.cc(280) peerSelectDnsPaths:
> Found sources for 'http://HOST:12345/'
> 2017/11/27 15:53:40.543| 44,2| peer_select.cc(281) peerSelectDnsPaths:
> always_direct = DENIED
> 2017/11/27 15:53:40.543| 44,2| peer_select.cc(282) peerSelectDnsPaths:
> never_direct = ALLOWED
> 2017/11/27 15:53:40.543| 44,2| peer_select.cc(292) peerSelectDnsPaths:
> cache_peer = local=127.0.0.3 remote=127.0.0.1:18070
> <http://127.0.0.1:18070> flags=1
> 2017/11/27 15:53:40.543| 44,2| peer_select.cc(295) peerSelectDnsPaths:
> timedout = 0
> 2017/11/27 15:53:40.543| 11,2| http.cc(2229) sendRequest: HTTP Server
> local=127.0.0.3:57091 <http://127.0.0.3:57091> remote=127.0.0.1:18070
> <http://127.0.0.1:18070> FD 40 flags=1
> 2017/11/27 15:53:40.543| 11,2| http.cc(2230) sendRequest: HTTP Server
> REQUEST:
> ---------
> GET http://HOST:12345/ HTTP/1.1
> User-Agent: curl/7.51.0
> Accept: */*
> Host: HOST:12345
> Cache-Control: max-age=259200
> Connection: keep-alive
>
>
> ----------
>
> [SKIPPED 40 seconds until parent closes TCP connection with FIN,ACK]
>
> 2017/11/27 15:54:20.627| 11,2| http.cc(1299) continueAfterParsingHeader:
> WARNING: HTTP: Invalid Response: No object data received for
> http://HOST:12345/ AKA HOST/
> 2017/11/27 15:54:20.627| 17,2| FwdState.cc(655)
> handleUnregisteredServerEnd: self=0x3e31838*2 err=0x409b338
> http://HOST:12345/
> 2017/11/27 15:54:20.627| 11,2| http.cc(2229) sendRequest: HTTP Server
> local=127.0.0.3:41355 <http://127.0.0.3:41355> remote=127.0.0.1:18070
> <http://127.0.0.1:18070> FD 40 flags=1
> 2017/11/27 15:54:20.627| 11,2| http.cc(2230) sendRequest: HTTP Server
> REQUEST:
> ---------
> GET http://HOST:12345/ HTTP/1.1
> User-Agent: curl/7.51.0
> Accept: */*
> Host: HOST:12345
> Cache-Control: max-age=259200
> Connection: keep-alive
>
>
> ----------
>
> [SKIPPED 40 seconds again until parent closes TCP connection with FIN,ACK]
>
> 2017/11/27 15:55:00.728| ctx: enter level 0: 'http://HOST:12345/'
> 2017/11/27 15:55:00.728| 11,2| http.cc(719) processReplyHeader: HTTP
> Server local=127.0.0.3:41355 <http://127.0.0.3:41355>
> remote=127.0.0.1:18070 <http://127.0.0.1:18070> FD 40 flags=1
> 2017/11/27 15:55:00.728| 11,2| http.cc(720) processReplyHeader: HTTP
> Server REPLY:
> ---------
> HTTP/1.0 502 Bad Gateway
> Cache-Control: no-cache
> Connection: close
> Content-Type: text/html
>
> <html><body><h1>502 Bad Gateway</h1>
> The server returned an invalid or incomplete response.
> </body></html>
>
> ----------
> 2017/11/27 15:55:00.728| ctx: exit level 0
> 2017/11/27 15:55:00.728| 20,2| store.cc(996) checkCachable:
> StoreEntry::checkCachable: NO: not cachable
> 2017/11/27 15:55:00.728| 20,2| store.cc(996) checkCachable:
> StoreEntry::checkCachable: NO: not cachable
> 2017/11/27 15:55:00.728| 88,2| client_side_reply.cc(2073)
> processReplyAccessResult: The reply for GET http://HOST:12345/ is
> ALLOWED, because it matched (access_log stdio:/var/log/squid/access.log > HTTP Client local=127.0.0.1:3128 <http://127.0.0.1:3128>
> line)
> 2017/11/27 15:55:00.728| 11,2| client_side.cc(1409) sendStartOfMessage:
> remote=127.0.0.1:53798 <http://127.0.0.1:53798> FD 45 flags=1
> 2017/11/27 15:55:00.728| 11,2| client_side.cc(1410) sendStartOfMessage:
> HTTP Client REPLY:
> ---------
> HTTP/1.1 502 Bad Gateway
> Date: Mon, 27 Nov 2017 23:54:20 GMT
> Cache-Control: no-cache
> Content-Type: text/html
> X-Cache: MISS from ip-172-23-18-130
> X-Cache-Lookup: MISS from ip-172-23-18-130:3128
> Transfer-Encoding: chunked
> Connection: keep-alive
>
>
> ----------
With best regards, Ivan Larionov.
_______________________________________________ squid-users mailing list squid-users@xxxxxxxxxxxxxxxxxxxxx http://lists.squid-cache.org/listinfo/squid-users