On Tue, 14 Dec 2010 17:24:01 +0100, David Gubler <dg@xxxxxxxxxx> wrote:
> I'm experimenting with Squid 3.0 as a reverse proxy. Currently, there

I'd suggest going to 3.1 then. Reverse-proxy support improved a bit
there.

> are two Squids running on the same machine, one for HTTP and one for
> HTTPS (Squid must use HTTPS for the connection to our web server if and
> only if the user did use HTTPS to contact Squid. I couldn't find another
> way to do this except using two Squids on the same machine, please tell
> me if there is another way. But for now, I can live with that.)
>
> Users can also contact our web server directly, bypassing Squid (for now
> at least).
>
> Our Java web application needs to know the originating client IP address
> (for GeoIP and the like). Squid puts it into the X-Forward-For header,
> so far so good.

Be careful, very very careful. ESPN is the example of the month for
doing this badly. Their site refuses to open for anyone browsing from a
host with a local proxy installed.

>
> Unfortunately, anyone bypassing Squid could also set an X-Forward-For
> header, so it cannot be trusted. Therefore, I need a way to authenticate
> Squid to our Apache server.
>
> I could configure the Squid's IP address on Apache. But this is
> undesirable, because Squid is running on EC2, its IP may change, and
> further EC2 instances can come and go.

The ONLY way to trust its contents is to verify that the listed content
is correct, individually, IP by IP, starting with the machine directly
connecting to supply it. If you can't track exactly where the proxy *is*
in cyberspace then you cannot trust anything sent.

XFF ACL tests will accept any ACL criteria that version of Squid can
take. This sounds like you need a src type ACL with the domain name. Due
to the way src works, current Squid will need to be reconfigured after
each IP change.

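To make the IP-by-IP check concrete, here is a rough Python sketch. It is
only an illustration of the idea, not anything Squid or Apache ships: the
function name resolve_client_ip and the trusted_networks list are made up
for the example, and it assumes you can actually enumerate the networks
your proxies live in.

```python
import ipaddress


def resolve_client_ip(peer_ip, xff_header, trusted_networks):
    """Walk X-Forwarded-For from right to left.

    An entry is believed only if the hop that supplied it is trusted.
    peer_ip is the address of the machine directly connected to us.
    trusted_networks is a hypothetical list of CIDR strings for our proxies.
    """
    trusted = [ipaddress.ip_network(net) for net in trusted_networks]

    def is_trusted(addr):
        return any(addr in net for net in trusted)

    current = ipaddress.ip_address(peer_ip)
    # XFF is a comma-delimited list; the rightmost entry was added last.
    entries = [e.strip() for e in (xff_header or "").split(",") if e.strip()]

    for entry in reversed(entries):
        if not is_trusted(current):
            # Whatever this hop claims about earlier hops cannot be verified.
            break
        try:
            current = ipaddress.ip_address(entry)
        except ValueError:
            # Garbage in the header: stop at the last verified address.
            break
    return str(current)
```

With 10.0.0.0/8 as the only trusted network, a direct connection from
203.0.113.9 carrying a forged header is reported as 203.0.113.9, while a
request relayed by 10.0.0.5 with "6.6.6.6, 198.51.100.7" is reported as
198.51.100.7, because the entry to the left of an untrusted hop is never
believed.
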
If you are interested, it should not be too hard to create a
src_volatile type ACL which does not cache any IPs and does DNS lookups
on every test (think 10k packets per second though).

Note that the public cloud infrastructure has been well and truly hacked
already, and infected machines appear to roam freely there nowadays. So
trusting unknown IPs in those spaces is a risky business.

>
> The method I would prefer is another HTTP header that contains a secret,
> which is added by Squid when the request is sent to our Apache. I could
> check for the presence of the Secret and the X-Forwarded-For header, and
> if both are fine I know that I can trust the IP-address in
> X-Forwarded-For. I know this isn't bullet-proof in the cryptographical
> sense, but if someone can intercept the communication between Squid and
> our Apache, he is most likely able to spoof TCP-Connections anyway.

Better, Squid can send Basic auth login credentials in the
Proxy-Authorization: header. squid-3.2 adds the Negotiate auth protocol
to this for more secure logins. Set the login= option on your cache_peer
linkage between the Squid and Apache.

>
> Unfortunately, I have tried header_replace, request_header_access and
> header_access, none of these options seems to be able to add a new HTTP
> header. Is there really no way to do this without using complicated and
> slow icap/ecap stuff?
>
> Thanks!
>
> David
>
> PS. If anyone is curious, here is some dirty stuff for Apache I came up
> with.
>
> # This is an ultra-evil hack to get the IP address from X-Forwarded-For
> # into Tomcat and the Apache log file, but only
> # if the request comes from one of our proxy servers (ip address
> # whitelisted by adding a file)
> RewriteRule .* - [E=MY_REMOTE_ADDR:%{REMOTE_ADDR}]
> RewriteCond /somepath/proxies/%{REMOTE_ADDR} -f
> RewriteCond %{HTTP:X-Forwarded-For}
> "^[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}$"
> RewriteRule .* - [E=MY_REMOTE_ADDR:%{HTTP:X-Forwarded-For}]
> RewriteRule .* - [E=JK_REMOTE_ADDR:%{ENV:MY_REMOTE_ADDR}]

Note that XFF contains a ', ' delimited list, so these rules may not
work as intended. The pattern only matches a single address; a request
that arrives with more than one entry in the header will fall back to
REMOTE_ADDR.

> LogFormat "%{MY_REMOTE_ADDR}e %l %u %t \"%r\" %>s %b %D \"%{Referer}i\"
> \"%{User-Agent}i\"" combinedwithdurationproxyaware
> CustomLog /somepath/access.log combinedwithdurationproxyaware
> ErrorLog /somepath/error.log

NP: when using a proxy, a large portion of the traffic will never even
reach the web server. At this point only the Squid logs are a true
record of the visitor traffic. Use them.

If you have log processors that depend on the Apache format, use the
built-in "common" logformat type for Squid's access.log.

Squid provides a syslog facility to push log lines out of the cloud
proxy and back to a central server for processing. Squid-3.2 is coming
with the infrastructure extensions needed to do this via several other
methods (UDP, TCP, or daemon straight to DB or accounting system) and
with longer log lines than syslog can handle.

Amos