Search squid archive

Re: Squid returns 500 on rapidly changing DNS records

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



OK, so...we have prepared to make some deeper debug on this issue but now it disappeared (for now).
We are almost sure that the problem is on the upstream side (DNS or SF Cloud Provider) - not Squid side.

So thank you for your help.

Cheers,
Maciek Leks

pt., 17 lut 2023 o 19:55 Alex Rousskov <rousskov@xxxxxxxxxxxxxxxxxxxxxxx> napisał(a):
On 2/17/23 13:24, Maciej Leks wrote:
> I understand your last question but it's hard me to answer clearly -
> yes/no.

Yes, it is a difficult question to answer. Logging DNS traffic with
tcpdump might help, but it does require some non-trivial work to
correlate everything. For now, without that or similar additional
information, it is easy for us to assume that you are just seeing the
effects of DNS responses without IP addresses.

There is another way to confirm or reject a bug claim. Share a
compressed ALL,9 cache.log while reproducing the problem with as few
transactions as possible (privately if needed).
https://wiki.squid-cache.org/SquidFaq/BugReporting#debugging-a-single-transaction


> Instead of this let me show you this log:

> I suppose the A/AAAA records should already be in the DNS, but the
> last resort squid gave up.

Why do you think that the IP addresses should already be (or appear) in
the parent Squid IP cache during these logged transactions?


Cheers,

Alex.


> 17/Feb/2023:17:16:21 +0100 1676650581.827    103 100.121.11.70
> 100.121.11.70 NONE_NONE/500 0 CONNECT
> scala-module-5515.scratch.lightning.force.com:443 - HIER_NONE/- -
> 17/Feb/2023:17:16:22 +0100 1676650582.297      0 100.121.11.70
> 100.121.11.70 NONE_NONE/500 0 CONNECT
> scala-module-5515.scratch.lightning.force.com:443 - HIER_NONE/- -
> 17/Feb/2023:17:16:22 +0100 1676650582.893      0 100.121.11.70
> 100.121.11.70 NONE_NONE/500 0 CONNECT
> scala-module-5515.scratch.lightning.force.com:443 - HIER_NONE/- -
> 17/Feb/2023:17:16:23 +0100 1676650583.625      0 100.121.11.70
> 100.121.11.70 NONE_NONE/500 0 CONNECT
> scala-module-5515.scratch.lightning.force.com:443 - HIER_NONE/- -
> 17/Feb/2023:17:16:23 +0100 1676650583.711      0 100.121.11.70
> 100.121.11.70 NONE_NONE/500 0 CONNECT
> scala-module-5515.scratch.lightning.force.com:443 - HIER_NONE/- -
> 17/Feb/2023:17:16:25 +0100 1676650585.073      0 100.121.11.70
> 100.121.11.70 NONE_NONE/500 0 CONNECT
> scala-module-5515.scratch.lightning.force.com:443 - HIER_NONE/- -
> 17/Feb/2023:17:16:25 +0100 1676650585.799      0 100.121.11.70
> 100.121.11.70 NONE_NONE/500 0 CONNECT
> scala-module-5515.scratch.lightning.force.com:443 - HIER_NONE/- -
> 17/Feb/2023:17:16:25 +0100 1676650585.882      0 100.121.11.70
> 100.121.11.70 NONE_NONE/500 0 CONNECT
> scala-module-5515.scratch.lightning.force.com:443 - HIER_NONE/- -
> 17/Feb/2023:17:16:26 +0100 1676650586.258      0 100.121.11.70
> 100.121.11.70 NONE_NONE/500 0 CONNECT
> scala-module-5515.scratch.lightning.force.com:443 - HIER_NONE/- -
> 17/Feb/2023:17:16:26 +0100 1676650586.995      0 100.121.11.70
> 100.121.11.70 NONE_NONE/500 0 CONNECT
> scala-module-5515.scratch.lightning.force.com:443 - HIER_NONE/- -
> 17/Feb/2023:17:16:28 +0100 1676650588.468      0 100.121.11.70
> 100.121.11.70 NONE_NONE/500 0 CONNECT
> scala-module-5515.scratch.lightning.force.com:443 - HIER_NONE/- -
> 17/Feb/2023:17:16:29 +0100 1676650589.159      0 100.121.11.70
> 100.121.11.70 NONE_NONE/500 0 CONNECT
> scala-module-5515.scratch.lightning.force.com:443 - HIER_NONE/- -
> 17/Feb/2023:17:16:29 +0100 1676650589.560      0 100.121.11.70
> 100.121.11.70 NONE_NONE/500 0 CONNECT
> scala-module-5515.scratch.lightning.force.com:443 - HIER_NONE/- -
> 17/Feb/2023:17:16:30 +0100 1676650590.319      0 100.121.11.70
> 100.121.11.70 NONE_NONE/500 0 CONNECT
> scala-module-5515.scratch.lightning.force.com:443 - HIER_NONE/- -
>
> This is the last resort parent squid and the 100.121.11.70 is our
> downstream squid (child). I understand it as follows, thanks to
> connect_retries set to 10 (here we have more than 10 tries) try to
> connect.



> pt., 17 lut 2023 o 17:46 Alex Rousskov
> <rousskov@xxxxxxxxxxxxxxxxxxxxxxx> napisał(a):
>>
>> On 2/17/23 09:31, Maciej Leks wrote:
>>
>>> We have being facing the 500 error codes in our squids (v5.7) when it
>>> comes to communicate with Salesforce DNS for ephemeral hosts
>>
>>>      A cs128-fra.fra.r.salesforce.com.              30s   85.222.154.211
>>>      A cs128-fra.fra.r.salesforce.com.              30s   160.8.252.83
>>>      A cs128-fra.fra.r.salesforce.com.              30s   85.222.152.83
>>
>> 30s TTLs will naturally limit the time Squid can cache these IPs for.
>>
>>
>>>  From time to time DNS answers only sending SOA record (no A,AAAA records)
>>
>> Squid will fail to forward traffic if Squid does not have the
>> destination IP address cached and its new DNS lookup returns no IPv4 or
>> IPv6 records. Moreover, I am not 100% sure -- needs checking/testing --
>> but Squid might cache that "negative" no-IPs result (see
>> negative_dns_ttl that defaults to 60 seconds).
>>
>> Do all 500 error cases correspond to DNS answers without IPs? In other
>> words, is there a 500 error case after a DNS answer with IPs (within 29
>> seconds from each other)?
>>
>> HTH,
>>
>> Alex.
>> P.S. Sorry for not answering your questions below directly. I do not
>> have the time to document what Squid debugging logs mean, and I
>> recommend sharing them without analysis (by non-developers).
>>
>>
>>> dig:
>>> IP address of sagittarius-oatmilk-722.scratch.my.salesforce.com is
>>> cs128.salesforce.com.
>>> cs128-fra.salesforce.com.
>>> cs128-fra.fra.r.salesforce.com.
>>> 160.8.255.83
>>> 160.8.249.83
>>> 85.222.154.211
>>> Failed to resolve sagittarius-oatmilk-722.scratch.my.salesforce.com
>>> Failed to resolve sagittarius-oatmilk-722.scratch.my.salesforce.com
>>> Failed to resolve sagittarius-oatmilk-722.scratch.my.salesforce.com
>>> Failed to resolve sagittarius-oatmilk-722.scratch.my.salesforce.com
>>> IP address of sagittarius-oatmilk-722.scratch.my.salesforce.com is
>>> cs128.salesforce.com.
>>> cs128-fra.salesforce.com.
>>> cs128-fra.fra.r.salesforce.com.
>>> 85.222.154.211
>>> 160.8.255.83
>>> 160.8.249.83
>>> IP address of sagittarius-oatmilk-722.scratch.my.salesforce.com is
>>> cs128.salesforce.com.
>>> cs128-fra.salesforce.com.
>>> cs128-fra.fra.r.salesforce.com.
>>> 160.8.253.83
>>>
>>> Squid outputs then:
>>> 16/Feb/2023:13:45:54 +0100 1676551554.420    107 100.113.24.84
>>> 100.113.24.84 NONE_NONE/500 0 CONNECT
>>> canal-trailblazer-1066.scratch.my.salesforce.com:443 - HIER_NONE/- - -
>>> - "/CN=gitlab-executor-stg.navio-services.opl-gcp.int"
>>> 16/Feb/2023:13:51:55 +0100 1676551915.973     82 100.113.25.178
>>> 100.113.25.178 NONE_NONE/500 0 CONNECT
>>> pumpkinspice-pisces-6588.scratch.my.salesforce.com:443 - HIER_NONE/- -
>>> - - "/CN=gitlab-executor-stg.navio-services.opl-gcp.int"
>>> 16/Feb/2023:13:52:44 +0100 1676551964.285     95 100.113.24.84
>>> 100.113.24.84 NONE_NONE/500 0 CONNECT
>>> trailblazer-builder-1043.scratch.file.force.com:443 - HIER_NONE/- - -
>>> Mozilla/5.0%20(X11;%20Linux%20x86_64)%20AppleWebKit/537.36%20(KHTML,%20like%20Gecko)%20HeadlessChrome/109.0.5414.74%20Safari/537.36
>>> "/CN=gitlab-executor-stg.navio-services.opl-gcp.int"
>>> 16/Feb/2023:13:56:11 +0100 1676552171.392     43 100.121.10.183
>>> 100.121.10.183 NONE_NONE/500 0 CONNECT
>>> pistachio-vente-806.scratch.my.salesforce.com:443 - HIER_NONE/- - - -
>>> "/CN=gitlab-executor-stg.navio-services.opl-gcp.int"
>>>
>>>
>>> Child squids get then (the same situation but different case):
>>> 17/Feb/2023:13:18:49 +0000 1676639929.989 88 100.121.10.104
>>> TCP_TUNNEL/500 0 CONNECT
>>> beans-goldengate-9524.scratch.my.salesforce.com:443 -
>>> ANY_OLD_PARENT/100.112.36.14 -
>>>
>>> Such a situation but not the same case from squid internals point of
>>> view I list below:
>>> 74,5| RequestParser.cc(377) doParse: request-line: retval 1:
>>> line={266, data=""> >>> baybridge-module-8155.scratch.my.salesforce.com:443 HTTP/1.1
>>> Host: baybridge-module-8155.scratch.my.salesforce.com:443
>>> 2023/02/15 15:44:24.223 kid1| 74,5| RequestParser.cc(379) doParse:
>>> request-line: url: baybridge-module-8155.scratch.my.salesforce.com:443
>>> 2023/02/15 15:44:24.223 kid1| 74,5| Parser.cc(192) grabMimeBlock: mime
>>> header (0-196) {Host:
>>> baybridge-module-8155.scratch.my.salesforce.com:443
>>> CONNECT baybridge-module-8155.scratch.my.salesforce.com:443 HTTP/1.1
>>> Host: baybridge-module-8155.scratch.my.salesforce.com:443
>>> Host: baybridge-module-8155.scratch.my.salesforce.com:443
>>> 2023/02/15 15:44:24.223 kid1| 23,3| Uri.cc(441) parse: Split URL
>>> 'baybridge-module-8155.scratch.my.salesforce.com:443' into proto='',
>>> host='baybridge-module-8155.scratch.my.salesforce.com', port='443',
>>> path=''
>>> 2023/02/15 15:44:24.223 kid1| 14,3| Address.cc(389) lookupHostIP:
>>> Given Non-IP 'baybridge-module-8155.scratch.my.salesforce.com': Name
>>> or service not known
>>> 2023/02/15 15:44:24.223 kid1| 33,5| Http1Server.cc(193)
>>> buildHttpRequest: normalize 1 Host header using
>>> baybridge-module-8155.scratch.my.salesforce.com:443
>>> 2023/02/15 15:44:24.224 kid1| 85,3| client_side_request.cc(637)
>>> hostHeaderVerify: validate
>>> host=baybridge-module-8155.scratch.my.salesforce.com, port=443,
>>> portStr=443
>>> 2023/02/15 15:44:24.224 kid1| 14,3| ipcache.cc(722)
>>> ipcache_gethostbyname: ipcache_gethostbyname:
>>> 'baybridge-module-8155.scratch.my.salesforce.com', flags=1
>>> 2023/02/15 15:44:24.224 kid1| 14,3| ipcache.cc(310) ipcacheRelease:
>>> ipcacheRelease: Releasing entry for
>>> 'baybridge-module-8155.scratch.my.salesforce.com'
>>> 2023/02/15 15:44:24.224 kid1| 14,3| Address.cc(389) lookupHostIP:
>>> Given Non-IP 'baybridge-module-8155.scratch.my.salesforce.com': Name
>>> or service not known
>>> 2023/02/15 15:44:24.224 kid1| 14,4| ipcache.cc(600)
>>> ipcache_nbgethostbyname:
>>> baybridge-module-8155.scratch.my.salesforce.com
>>> 2023/02/15 15:44:24.224 kid1| 14,3| Address.cc(389) lookupHostIP:
>>> Given Non-IP 'baybridge-module-8155.scratch.my.salesforce.com': Name
>>> or service not known
>>> 2023/02/15 15:44:24.224 kid1| 14,5| ipcache.cc(660)
>>> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: MISS for
>>> 'baybridge-module-8155.scratch.my.salesforce.com'
>>> 2023/02/15 15:44:24.224 kid1| 78,3| dns_internal.cc(1788) idnsALookup:
>>> idnsALookup: buf is 65 bytes for
>>> baybridge-module-8155.scratch.my.salesforce.com, id = 0x9c4
>>> 2023/02/15 15:44:24.224 kid1| 78,3| dns_internal.cc(1724)
>>> idnsSendSlaveAAAAQuery: buf is 65 bytes for
>>> baybridge-module-8155.scratch.my.salesforce.com, id = 0x5b5e
>>> 2023/02/15 15:44:24.224 kid1| 28,3| DestinationIp.cc(78) match: can't
>>> yet compare 'to_localhost' ACL for
>>> baybridge-module-8155.scratch.my.salesforce.com
>>> 2023/02/15 15:44:24.224 kid1| 14,4| ipcache.cc(600)
>>> ipcache_nbgethostbyname:
>>> baybridge-module-8155.scratch.my.salesforce.com
>>> 2023/02/15 15:44:24.224 kid1| 14,3| Address.cc(389) lookupHostIP:
>>> Given Non-IP 'baybridge-module-8155.scratch.my.salesforce.com': Name
>>> or service not known
>>> 2023/02/15 15:44:24.224 kid1| 14,5| ipcache.cc(660)
>>> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: MISS for
>>> 'baybridge-module-8155.scratch.my.salesforce.com'
>>> 2023/02/15 15:44:24.247 kid1| 14,3| ipcache.cc(456) latestError: DNS
>>> error while resolving baybridge-module-8155.scratch.my.salesforce.com:
>>> No DNS records
>>> 2023/02/15 15:44:24.248 kid1| 14,3| ipcache.cc(456) latestError: DNS
>>> error while resolving baybridge-module-8155.scratch.my.salesforce.com:
>>> No DNS records
>>> 2023/02/15 15:44:24.282 kid1| 14,3| ipcache.cc(477) ipcacheParse: 3
>>> answers for baybridge-module-8155.scratch.my.salesforce.com
>>> 2023/02/15 15:44:24.282 kid1| 14,3| ipcache.cc(576)
>>> ipcacheHandleReply: done with
>>> baybridge-module-8155.scratch.my.salesforce.com: [no cached IPs]
>>> 2023/02/15 15:44:24.282 kid1| 14,3| ipcache.cc(477) ipcacheParse: 3
>>> answers for baybridge-module-8155.scratch.my.salesforce.com
>>> 2023/02/15 15:44:24.282 kid1| 14,3| ipcache.cc(576)
>>> ipcacheHandleReply: done with
>>> baybridge-module-8155.scratch.my.salesforce.com: [no cached IPs]
>>> 2023/02/15 15:44:24.282 kid1| 14,3| ipcache.cc(310) ipcacheRelease:
>>> ipcacheRelease: Releasing entry for
>>> 'baybridge-module-8155.scratch.my.salesforce.com'
>>> 2023/02/15 15:44:24.282 kid1| 14,3| ipcache.cc(722)
>>> ipcache_gethostbyname: ipcache_gethostbyname:
>>> 'baybridge-module-8155.scratch.my.salesforce.com', flags=1
>>> 2023/02/15 15:44:24.282 kid1| 28,3| DomainData.cc(110) match:
>>> aclMatchDomainList: checking
>>> 'baybridge-module-8155.scratch.my.salesforce.com'
>>> 2023/02/15 15:44:24.282 kid1| 28,3| DomainData.cc(115) match:
>>> aclMatchDomainList: 'baybridge-module-8155.scratch.my.salesforce.com'
>>> NOT found
>>> 2023/02/15 15:44:24.283 kid1| 28,3| DomainData.cc(110) match:
>>> aclMatchDomainList: checking
>>> 'baybridge-module-8155.scratch.my.salesforce.com'
>>> 2023/02/15 15:44:24.283 kid1| 28,3| DomainData.cc(115) match:
>>> aclMatchDomainList: 'baybridge-module-8155.scratch.my.salesforce.com'
>>> found
>>> 2023/02/15 15:44:24.283 kid1| 85,2| client_side_request.cc(745)
>>> clientAccessCheckDone: The request CONNECT
>>> baybridge-module-8155.scratch.my.salesforce.com:443 is ALLOWED; last
>>> ACL checked: gitlab-executor-stg-dst
>>> 2023/02/15 15:44:24.284 kid1| 85,2| client_side_request.cc(745)
>>> clientAccessCheckDone: The request CONNECT
>>> baybridge-module-8155.scratch.my.salesforce.com:443 is ALLOWED; last
>>> ACL checked: gitlab-executor-stg-dst
>>> 2023/02/15 15:44:24.284 kid1| 85,4| client_side_request.cc(1514)
>>> processRequest: CONNECT
>>> baybridge-module-8155.scratch.my.salesforce.com:443
>>> 2023/02/15 15:44:24.284 kid1| 26,3| tunnel.cc(1140) tunnelStart:
>>> CONNECT baybridge-module-8155.scratch.my.salesforce.com:443 HTTP/1.1
>>> 2023/02/15 15:44:24.284 kid1| 44,3| peer_select.cc(612) selectMore:
>>> CONNECT baybridge-module-8155.scratch.my.salesforce.com
>>> 2023/02/15 15:44:24.284 kid1| 44,3| peer_select.cc(1098) addSelection:
>>> adding HIER_DIRECT#baybridge-module-8155.scratch.my.salesforce.com
>>> 2023/02/15 15:44:24.284 kid1| 44,2| peer_select.cc(460)
>>> resolveSelected: Find IP destination for:
>>> baybridge-module-8155.scratch.my.salesforce.com:443' via
>>> baybridge-module-8155.scratch.my.salesforce.com
>>> 2023/02/15 15:44:24.284 kid1| 14,4| ipcache.cc(607) nbgethostbyname:
>>> baybridge-module-8155.scratch.my.salesforce.com
>>> 2023/02/15 15:44:24.284 kid1| 14,3| Address.cc(389) lookupHostIP:
>>> Given Non-IP 'baybridge-module-8155.scratch.my.salesforce.com': Name
>>> or service not known
>>> 2023/02/15 15:44:24.284 kid1| 14,4| ipcache.cc(647)
>>> ipcache_nbgethostbyname_: ipcache_nbgethostbyname: HIT for
>>> 'baybridge-module-8155.scratch.my.salesforce.com'
>>> 2023/02/15 15:44:24.284 kid1| 44,2| peer_select.cc(479)
>>> resolveSelected: PeerSelector4664913 found all 0 destinations for
>>> baybridge-module-8155.scratch.my.salesforce.com:443
>>> 2023/02/15 15:44:24.284 kid1| 4,3| errorpage.cc(1249)
>>> compileLegacyCode: %U -->
>>> 'https://baybridge-module-8155.scratch.my.salesforce.com/*'
>>> 2023/02/15 15:44:24.284 kid1| 4,3| errorpage.cc(1249)
>>> compileLegacyCode: %U -->
>>> 'https://baybridge-module-8155.scratch.my.salesforce.com/*'
>>> 2023/02/15 15:44:24.284 kid1| 4,3| errorpage.cc(1249)
>>> compileLegacyCode: %W -->
>>> '?subject=CacheErrorInfo%20-%20ERR_CANNOT_FORWARD&body=CacheHost%3A%20opl-egress-proxy-vm-31vr%0D%0AErrPage%3A%20ERR_CANNOT_FORWARD%0D%0AErr%3A%20%5Bnone%5D%0D%0ATimeStamp%3A%20Wed,%2015%20Feb%202023%2014%3A44%3A24%20GMT%0D%0A%0D%0AClientIP%3A%20100.121.10.148%0D%0A%0D%0AHTTP%20Request%3A%0D%0ACONNECT%20%20HTTP%2F1.1%0AVia%3A%201.1%20gitlab-proxy-6d445dbd8d-mxtqv%20(squid%2F5.7)%0D%0AX-Forwarded-For%3A%20100.113.24.224%0D%0ACache-Control%3A%20max-age%3D259200%0D%0AConnection%3A%20close%0D%0AHost%3A%20baybridge-module-8155.scratch.my.salesforce.com%3A443%0D%0A%0D%0A%0D%0A'
>>> 2023/02/15 15:44:24.286 kid1| 33,3| client_side_request.cc(272)
>>> ~ClientHttpRequest: httpRequestFree:
>>> baybridge-module-8155.scratch.my.salesforce.com:443
>>> 2023/02/15 15:44:24.286 kid1| 33,5| client_side.cc(385) logRequest:
>>> logging half-baked transaction:
>>> baybridge-module-8155.scratch.my.salesforce.com:443
>>>
>>> Our parent proxy settings for DNS:
>>> negative_dns_ttl 3 seconds  #if we set 1s it behaves the same
>>> dns_nameservers 8.8.8.8 1.1.1.1
>>>
>>> The question is:
>>> If you carefully look at the debug log you see:
>>> a. I(Squid) can't find IP for the name
>>> b. I've got 3 IPs from DNS
>>> c. OK, so I've got IP
>>> d. ...Oh, I cannot find any IP
>>> e. I can't find any destinations (what's that?)
>>> f. I'm outputing an error
>>>
>>> So, do you think it's ok or there is a bug?
>>>
>>> Rgds,
>>> Maciej Leks
>>> _______________________________________________
>>> squid-users mailing list
>>> squid-users@xxxxxxxxxxxxxxxxxxxxx
>>> http://lists.squid-cache.org/listinfo/squid-users
>>
>> _______________________________________________
>> squid-users mailing list
>> squid-users@xxxxxxxxxxxxxxxxxxxxx
>> http://lists.squid-cache.org/listinfo/squid-users
> _______________________________________________
> squid-users mailing list
> squid-users@xxxxxxxxxxxxxxxxxxxxx
> http://lists.squid-cache.org/listinfo/squid-users

_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users
_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users

[Index of Archives]     [Linux Audio Users]     [Samba]     [Big List of Linux Books]     [Linux USB]     [Yosemite News]

  Powered by Linux