
Re: can squid follow 301/302 without passing the 301/302 back to the client?

This is a terrible idea. Details below.

On 20/01/2013 9:44 a.m., carteriii wrote:
I put together the textual diagram below to help describe to someone what I
am trying to do.  I thought I'd include it here in case it gives someone else
any ideas.  The greater-than and less-than symbols indicate the direction
in which information is flowing.

These first two lines show what normally happens today.  The client requests
a specific page.  That request goes to Squid (acting as a reverse proxy),
which in turn passes that same request on to the appropriate Server.  The
"Server" then responds with a 301/302 and the location that should be used
instead.  This response is passed through Squid to the client.


Client > > (req: http://page) > > Squid > > (req: http://page) > > Server

Client < < (resp: 301/location) < < Squid < < (resp: 301/location) < < Server
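
As a rough illustration of that exchange (host names hypothetical), here is what the client sees when the 301 is relayed untouched; Python's http.client does not follow redirects itself, so the Location header arrives at the client as-is:

    import http.client

    # Hypothetical reverse-proxy host; names are illustrative only.
    conn = http.client.HTTPConnection("squid.example.com", 80)
    conn.request("GET", "/page")
    resp = conn.getresponse()

    # Squid relays the origin server's redirect unchanged, so the
    # client itself sees the 301 and the new location.
    print(resp.status)                  # e.g. 301
    print(resp.getheader("Location"))   # e.g. http://other.example.com/page
    conn.close()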


This next diagram shows what I would like to have happen.  The first line is
exactly the same.  The second & third lines show that when the 301 response
from the server gets to Squid, Squid would turn around and request the new
"location" from another server.  Only when the response comes back from that
server will Squid return that response to the client.  The client is never
aware that a 301 or 302 occurred.

Client > > (req: http://page) > > Squid > > (req: http://page) > > Server1
Squid < < (resp: 301/location) < < Server1
Squid > > (req: http://location) > > Server2
Client < < (resp: content) < < Squid < < (resp: content) < < Server2
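
A minimal sketch of that proposed behaviour in Python (the function name and URLs are hypothetical); urllib follows 301/302 responses internally by default, so the caller only ever sees the final response, exactly as described above:

    import urllib.request

    def fetch_hiding_redirects(url):
        # urllib's default opener chases 301/302 redirects itself,
        # so the redirect never reaches whoever called this function.
        with urllib.request.urlopen(url) as resp:
            return resp.status, resp.geturl(), resp.read()

    status, final_url, body = fetch_hiding_redirects("http://squid.example.com/page")
    # status is the 200 from the redirect target; final_url differs from
    # the requested URL, but a client in this position never learns that.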

Meaning that the client is not aware of what URL location it has actually been handed, and which it needs to cache that response under.  All the client has is the original request URL.  So Squid has just performed a cache poisoning attack on the client, corrupting its future HTTP requests to that URL and any relative-URL snippets embedded in the object.
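
To make the relative-URL corruption concrete, a small sketch (URLs hypothetical): once the body fetched from the redirect target is cached under the original URL, every relative link inside it resolves against the wrong base:

    from urllib.parse import urljoin

    requested = "http://example.com/page"         # what the client asked for
    actual = "http://other.example.net/app/page"  # where the 301 really led
    link = "images/logo.png"                      # relative link in the body

    print(urljoin(actual, link))     # http://other.example.net/app/images/logo.png
    print(urljoin(requested, link))  # http://example.com/images/logo.png
                                     # <- what the poisoned client wrongly computes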

* In HTTP there is the likely possibility that the client is another proxy closer to the actual end-user, resulting in more than just one end-user receiving the corrupted cached responses.  Free attack amplification from one visitor to an entire network.

* In HTTP there is the possibility that some third-party intermediary is the one that responded with the 30x.  For example, a pay-as-you-go gateway which suddenly needs more payment will 30x redirect to its payment page even if http://google.com/ was requested.  An innocent ISP doing the Right Thing(tm) with a 30x to separate itself from the originally requested site is suddenly seen as hijacking traffic in networks well beyond its gateway.

* In HTTP the server is influential in determining the caching time for its response (see the sketch after this list).  Meaning that when an actual attack takes place using a 30x to inject cache poison, the corruption will stick around in the cache long after the actual attacker has disappeared.  The effect is the same as replacing pages on some website with your own copies, but where the website host is completely unable to see the change or wipe the disk files back to uncorrupted copies.
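
That caching-time point in the last bullet can be shown with a deliberately simplified freshness check (real HTTP caching rules have many more cases than this); the header value here is whatever the responding server, i.e. a possible attacker, chose to send:

    import re

    def freshness_lifetime(cache_control):
        # Simplified: only looks at max-age; real caches follow the
        # full HTTP freshness rules.
        m = re.search(r"max-age=(\d+)", cache_control)
        return int(m.group(1)) if m else 0

    # An attacker injecting content via a 30x picks its own lifetime:
    print(freshness_lifetime("public, max-age=31536000"))  # poisoned for a year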


So as you should be able to see, it is extremely unsafe and a Bad Idea(tm) to do this anywhere outside the end-user's own browser cache.

Beyond the poisoning aspect, you are also injecting into the network towards the client all the other annoying problems which are documented for URL-rewriting.  There is a fairly long laundry list of reasons why automatic 30x following and other forms of URL-rewriting should not be done by a shared proxy or intermediary device.  The above poisoning consequences are just the #1 most important of them.

NP: the URL-rewrite and Store-URL features in Squid are only possible because (a) there is only one request involved in the rewrite at any one time, and (b) the Squid administrator is directly in charge of the URL merging, not some third party (a possible attacker) supplying 30x information.  Even so they are still dangerous features and can lead to cache poisoning unless used sparingly and targeted with great precision.
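
For contrast, a minimal sketch of that sanctioned mechanism: a url_rewrite_program helper whose mapping is fixed by the administrator rather than taken from any response.  The exact helper protocol varies between Squid versions; this assumes the classic form of one request line in, one reply line out, with a blank reply meaning "leave the URL unchanged":

    #!/usr/bin/env python3
    import sys

    # Admin-controlled mapping: the only rewrites this helper ever performs.
    REWRITES = {
        "http://old.example.com/page": "http://new.example.com/page",
    }

    for line in sys.stdin:
        parts = line.split()
        if not parts:
            continue
        url = parts[0]  # first token of each request line is the URL
        sys.stdout.write(REWRITES.get(url, "") + "\n")
        sys.stdout.flush()  # Squid waits for one reply per request line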

Amos

