
Transparent deployment vs. web services behind DNS load balancing

My installation is fairly simple: I run Squid 5.8 in transparent mode on a pf-based firewall (FreeBSD 14.0).

I intercept both HTTP (port 80) and HTTPS (port 443), splicing the exceptions I keep in a whitelist and bumping everything else. Simple.
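
For reference, the bump logic amounts to the following (paths and ACL names here are placeholders for my actual files):

    # squid.conf -- intercept both ports, splice the whitelist, bump the rest
    http_port  3129 intercept
    https_port 3130 intercept ssl-bump tls-cert=/usr/local/etc/squid/ca.pem generate-host-certificates=on

    acl splice_domains ssl::server_name "/usr/local/etc/squid/splice_domains.txt"
    acl step1 at_step SslBump1

    ssl_bump peek   step1
    ssl_bump splice splice_domains
    ssl_bump bump   all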

This is a relatively recent deployment, and it has been working well as far as the web browser experience is concerned. Nonetheless, I have observed a certain number of 409 (Conflict) responses that share common traits (more on that later). Rest assured, I have made 100% certain that my clients and Squid use the same resolver (Unbound), installed on the same box with a fairly basic configuration.
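
Concretely, squid.conf is pointed at the local Unbound instance, and DHCP hands the same resolver to the clients (the address here is a placeholder):

    # squid.conf -- use the same Unbound instance the clients use
    dns_nameservers 127.0.0.1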

The 409s I am getting all share the same traits: the original client request came from an application or OS-related task, targeting DNS records with very low TTLs (5 minutes or less, often 2 minutes). I could easily identify the vast majority of these domains as being load balanced with DNS solutions such as Azure Traffic Manager and Akamai DNS.
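
The low TTLs are easy to confirm against the local resolver; for example (the hostname is a placeholder):

    $ dig @127.0.0.1 +noall +answer some-app.trafficmanager.net A

The first numeric field in each answer line is the remaining TTL in seconds.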

Now, this makes sense: a thread on the client may initiate a long-running task lasting a couple of minutes (longer than the TTL), during which it establishes several connections without calling gethostbyname() again. Meanwhile, Squid re-resolves the name, receives a different answer, and flags a forgery attempt, since it is unable to validate that the destination IP matches the intended destination domain. Essentially, this creates false positives and drops legitimate traffic.
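
One partial mitigation that follows from this reasoning is clamping the minimum TTL in the shared resolver, so Squid and the clients keep serving the same answer for longer; it narrows the window but does not close it. A minimal unbound.conf sketch (the value is an arbitrary example):

    server:
        # serve cached answers for at least 15 minutes, even when the
        # authoritative TTL is lower (e.g. 60-120 s Traffic Manager records)
        cache-min-ttl: 900

Of course, this trades away some of the agility these DNS load balancers rely on.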

I have searched a lot, and the only reliable way I have found to completely solve this issue in a transparent deployment is to maintain IP lists for such services (Azure Cloud, Azure Front Door, AWS, Akamai, and so on) and bypass Squid entirely based on the destination IP address.
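
In pf terms the bypass looks roughly like this ($int_if, $lan_net, the table name, and the file path are placeholders for my actual setup):

    # /etc/pf.conf -- traffic to these networks never enters Squid
    table <cdn_bypass> persist file "/etc/pf.cdn_bypass"

    # everything else on 80/443 is redirected to Squid's intercept ports
    rdr pass on $int_if inet proto tcp from $lan_net to ! <cdn_bypass> port 80  -> 127.0.0.1 port 3129
    rdr pass on $int_if inet proto tcp from $lan_net to ! <cdn_bypass> port 443 -> 127.0.0.1 port 3130

The table file is rebuilt periodically from the providers' published ranges (Azure's service tags JSON, AWS's ip-ranges.json, and so on).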

I'd be interested to hear what other approaches there might be. Some package maintainers have chosen to drop the header check altogether ( https://github.com/NethServer/dev/issues/5348 ). I believe a better approach could be to validate that the SNI of the TLS Client Hello matches the certificate obtained from the remote web server, perform the usual certificate validation (is it trusted, valid, etc.), and not rely so much on the DNS check, which can be expected to fail at times given how ubiquitous DNS load balancing is with cloud-native solutions and large CDNs. But implementing this change would require serious development work, which I am completely unable to take on myself.
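
To illustrate the idea only (a toy sketch in Python, not Squid code; the function name is mine): connect to the intercepted destination IP, present the client's SNI, and let standard certificate validation confirm the name, with no DNS comparison involved:

    import socket, ssl

    def sni_matches_server(dst_ip, sni, port=443, timeout=5):
        """True if the server at dst_ip presents a trusted cert valid for sni."""
        ctx = ssl.create_default_context()  # CERT_REQUIRED + check_hostname=True
        try:
            with socket.create_connection((dst_ip, port), timeout=timeout) as sock:
                # server_hostname sets the SNI *and* the name the certificate
                # must match; a mismatch raises ssl.SSLCertVerificationError
                with ctx.wrap_socket(sock, server_hostname=sni):
                    return True
        except (ssl.SSLError, OSError):
            return False

Doing this inline during interception is, of course, precisely the serious development work I mean.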

Hence: what have others been doing? I want my web security gateway to remain transparent, and I do not want to resort to PAC files.

Thank you.
_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
https://lists.squid-cache.org/listinfo/squid-users
