Re: Accelerator mode, select peer from request destination ip (feature request?)

Amos Jeffries <squid3@xxxxxxxxxxxxx> · Wed, 04 Nov 2009 00:12:22 +1300

Justo Alonso wrote:
Hi Amos !

On Mon, Nov 2, 2009 at 11:26 PM, Amos Jeffries <squid3@xxxxxxxxxxxxx> wrote:
You seem to have mixed up your view of the information passed versus the
actions taken and what virtual hosting actually does.

On Mon, 2 Nov 2009 21:22:33 +0100, Justo Alonso <justo.alonso@xxxxxxxxx>
wrote:
Hi !
    I'm trying to setup an apache & squid in accelerator mode
configuration.

You start by indicating that you are trying to configure a reverse proxy.

   I have the apache in Listen *:80 .. with many virtualhosts and many
namevirtualhosts (namevirtualhost *:80 too). I have the squid at
http_port 8080.

Reverse proxies do not work this way. I wonder if the many other meanings of
the word "accelerator" have confused you.

A reverse proxy is software talking to the client software pretending to
be a web server, but sourcing the replies over network from a real web
server elsewhere.

The client will type http://example.com/ into their browser address bar.
In your config that will take them directly to the apache.

To make a reverse-proxy useful you move apache to some other port and
place the proxy listening on port 80. Which then is configured to source
the data from apache on its other port.

So that when clients enter http://example.com/ into their browser address
bar it will take them directly to the proxy.
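
A rough sketch of that layout, assuming Apache is moved to port 8081 on the
same machine (the port number, the loopback address and the peer name
"apache" are placeholders of mine, not taken from your setup):

  # Squid takes over the public web port
  http_port 80 accel defaultsite=www.example.com

  # Apache, now told to "Listen 8081", is where Squid sources the replies
  cache_peer 127.0.0.1 parent 8081 0 no-query originserver name=apache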

That's ok, the port is not important. In my configuration, the
firewall NATs port 80 to 8080 for the sites with a reverse proxy:

client to www.example.com:80 -> fw nat www.internal.example.com:8080
-> squid cache in 8080 -> apache in 80

Alert!! your Squid is now a NAT interception proxy.

IP addresses as viewed by Squid are always:
  src:   client.IP with random port
  dst:   www.internal.example.com with port 8080

So the client requests http://www.example.com in their browser and
squid receives the request on port 8080. The port is not important.

   And now I want squid to send the request to apache on the same
destination ip that the client requested. For example:

client request -> squid 10.0.0.1:8080 -> apache 10.0.0.1:80
client request -> squid 10.0.0.2:8080 -> apache 10.0.0.2:80 ....

You then indicate that you want Squid to make actual physical TCP links to
different Apaches, based on what the receiving Squid IP was. This has
nothing to do with virtual hosting.

Yes, if you have namevirtualhosting. If you have many virtualhost and
many namevirtualhost (the machine with apache and squid has many
interface aliases)

With namevirtualhost and virtualhost apache selects the correct
configuration for the site with the destination ip and port and the
servername/serveralias directives

Due to the NAT above, the destination IP is erased and replaced with a
constant value.

This requirement is met by adding a cache_peer directive for each back-end
Apache server, then using cache_peer_access with an ACL of the "myip" type
to limit the requests passed to each peer to those received on the
matching input IP.
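
For the two addresses in your example that would look roughly like this.
It is only a fragment: the peer and ACL names are made up for illustration,
and it assumes requests really do arrive on the distinct Squid IPs.

  # one peer per back-end Apache address
  cache_peer 10.0.0.1 parent 80 0 no-query originserver name=apache1
  cache_peer 10.0.0.2 parent 80 0 no-query originserver name=apache2

  # match on the Squid IP that received the request
  acl on_ip1 myip 10.0.0.1
  acl on_ip2 myip 10.0.0.2

  # each peer only gets the requests received on its matching IP
  cache_peer_access apache1 allow on_ip1
  cache_peer_access apache1 deny all
  cache_peer_access apache2 allow on_ip2
  cache_peer_access apache2 deny all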

Yes, but this makes the configuration very complex. You need to
configure one cache_peer per interface plus acls for each cache_peer. If
you have about 150 interfaces per machine, the possibility of making a
mistake is higher!

I want to configure one http_port and a cache_peer by virtualhost ...
I need a global configuration ... all client requests redirected to
the same destination ip on a different port.

You then say that all these apache listening addresses are actually the
same machine.

cache_peer is designed to be a TCP link to a _single_ backend software.
Virtual hosting has nothing to do with it. This is so low down as to be
almost wire-level stuff.

ummm ... but .. if the same-dst-ip option is defined on this
cache_peer, we can change the cache_peer configuration or create a
dynamic cache_peer for attend this request setting the host of this
cache_peer to the destination ip (the ip that the squid receive the
request)

Reading between the lines .... I guess that you actually want Squid to
pass the right information such that incoming requests are handled by the
right virtualhost inside Apache, correct?

That is done correctly by having the right accelerator setup, using the
http_port settings vhost, vport and defaultsite to alter the Host: header
(virtualhost name) as applicable. The cache_peer forcedomain= option is
also available, but that is for force-sending only a single virtualhost
name to the Apache, the opposite of what you want.

By setting squid to listen on port 80 the client software will be sending
Squid the correct Host: header information as needed by apache to find the
virtualhost.

All you need to do is configure "http_port 80 accel vhost" and one single
cache_peer line pointing at Apache, and it works.
http://wiki.squid-cache.org/ConfigExamples/Reverse/VirtualHosting
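
Along the lines of that wiki page, a sketch of the whole thing (the domain
names and the Apache address and port are placeholders, so adjust them to
your own sites):

  http_port 80 accel vhost defaultsite=www.example.com
  cache_peer 127.0.0.1 parent 8081 0 no-query originserver name=apache

  # only accept and forward requests for the sites Apache actually hosts
  acl our_sites dstdomain www.example.com www.example.org
  http_access allow our_sites
  http_access deny all
  cache_peer_access apache allow our_sites
  cache_peer_access apache deny all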

But my apache listens on different interfaces and has a namevirtualhost
configuration, so one cache_peer doesn't work.

Reading the documentation I can't find anything about this, so I'm trying
to add a new option to cache_peer (same-dst-ip). If this boolean option
is set, the cache_peer ignores its configured host, takes the destination
ip from the request, and opens a connection to it on the cache_peer port.

What do you think about this? Any comments? Should I maybe send this
question to the squid-dev list?

thanks in advance,
justo

Your description of 'same-dst-ip' seems to me to be you attempting to make
squid a semi-transparent proxy instead of accelerator by opening security
vulnerability CVE-2009-0801 for reverse-proxy (accelerator) configurations
as well as interception-proxy configurations. This is a very bad idea.

The CVE-2009-0801 doesn't apply with this configuration ... It's the

A client can connect to any of the Squid IPs and forge the Host: info of
any apache IP. In your case it just has very limited impact.
The dst-ip option would make squid ignore the cache_peer's explicit
destination and loop back on itself.

same case as if the client connects directly to the web server. Squid
doesn't select the cache_peer by Host header, it selects by the
destination ip of the request.

Above you said NAT was involved.
 "client to www.example.com:80 -> fw nat www.internal.example.com:8080"

Therefore the destination IP will _always_ be the IP of
www.internal.example.com, because NAT erased the real destination IP and
replaced it.

This breaks your option as described for accelerators.

If you find the basic accelerator config I outlined above still is not
workable for your specific needs then please get back to me on the details
of why you think this option might still be needed. I'm working on fixes
for the CVE and would like to make sure any such option fits with the
solution and does not open new vulnerabilities.

Amos

justo

Given that your Squid is really a NAT interception proxy you can drop 
the pretense of configuring it as an accelerator...

  http_port 8080 transparent

  acl toapaches dst 10.0.0.0/8
  http_access allow toapaches

Will do exactly what you are asking to be done. Squid will decode the 
Host: header, look it up in DNS and send the request to the relevant 
server IP and port as encoded by the client.

Now all you need do is make the DNS resolver tell Squid the internal
apache IP for each of the namevirtualhosts that apache serves (DNS views).
You can even have multiple IPs returned for each name to get seamless
failover between apaches when some go down.

Amos
--
Please be using
  Current Stable Squid 2.7.STABLE7 or 3.0.STABLE20
  Current Beta Squid 3.1.0.14