
Re: Caching issue with http_port when running in transparent mode

Amos Jeffries wrote:

> On 29/05/2012 6:12 p.m., Hans Musil wrote:
> > Amos Jeffries wrote:
> >> On 29.05.2012 08:13, Eliezer Croitoru wrote:
> >>> hey there Hans,
> >>>
> >>> are you running squid on the same machine as the gateway? (I wasn't
> >>> sure about the DNAT.)
> >>> Your problem is not directly related to squid but to the way that TCP
> >>> and browsers work.
> >>> Every connection that the client browser opens is a TCP connection
> >>> that stays alive for a period of time after the page was served.
> >>> As a result, all the connections that were served through port
> >>> 3128 still exist for, I think, 5 to 10 more minutes, or whatever
> >>> your TCP stack settings specify.
> >>
> >> While that is true for the TCP details, I think HTTP connection
> >> behaviour is why that matters. Before the TCP timeout and closure can
> >> start, HTTP first has to stop using the connection.
> >>
> >> iptables NAT only affects SYN packets (ie new connections). So any 
> >> existing TCP connections made by HTTP WILL continue to operate 
> >> despite any changes to NAT rules.
> >>
> >> HTTP persistent connections, CONNECT tunnels and HTTP
> >> "streaming"/large objects have no fixed lifetime and an idle timeout
> >> of several minutes. It is quite common to see client TCP connections
> >> lasting whole hours or days with HTTP traffic flowing throughout.
> >>
> >>>
> >>> On 28/05/2012 22:34, Hans Musil wrote:
> >>>> Hi,
> >>>>
> >>>> my box is running Debian Squeeze, which uses SQUID version
> >>>> 2.7.STABLE9, but my problem also seems to affect SQUID version 3.1.
> >>>>
> >>>> These are the important lines from my squid.conf:
> >>>>
> >>>> http_port 3128 transparent
> >>>> http_port 3129 transparent
> >>>> url_rewrite_program /etc/squid/url_rewrite.php
> >>>>
> >>>>
> >>>> First, I configured my Linux iptables like this:
> >>>>
> >>>> # Generated by iptables-save v1.4.8 on Mon May 28 21:04:09 2012
> >>>> *nat
> >>>> :PREROUTING ACCEPT [0:0]
> >>>> :POSTROUTING ACCEPT [0:0]
> >>>> :OUTPUT ACCEPT [0:0]
> >>>> -A PREROUTING -i eth1 -p tcp -m tcp --dport 80 -j DNAT --to-destination 10.17.0.1:3128
> >>>> COMMIT
> >>>>
> >>>> and everything works fine.
> >>>>
> >>>> But when I change the redirect port in the iptables settings from
> >>>> 3128 to 3129, Squid behaves strangely: my URL rewrite program still
> >>>> gets sent myport=3128, although there are definitely no more
> >>>> requests on this port, only on 3129. This only affects HTTP
> >>>> domains that have already been requested before, i.e. with
> >>>> redirection to port 3128, and it works fine again when I do a
> >>>> force-reload in my browser. Also, things return to normal after
> >>>> waiting a few minutes.
> >>>>
> >>>> I suppose there is some strange caching inside Squid that maps the 
> >>>> HTTP domain to an incoming port.
> >>
> >> No. There is only an active TCP connection. Multiple HTTP requests can
> >> arrive on the connection long after you start sending unrelated new
> >> connections+requests through other ports.
> >>
> >>
> >> What your helper was passed is the details about the request Squid
> >> received. It arrived on a TCP connection which was accepted through
> >> Squid port 3128. The fact that you changed the kernel settings after
> >> that connection was set up and operating is irrelevant.
> >>
> >>
> >> URL-rewriting is a form of NAT on the URL, but with far worse
> >> side-effects than IP-layer NAT, and it is often a sign of major design
> >> mistakes somewhere in the network. Why do you have to re-write in the
> >> first place? Perhaps we could point you at a simpler, more
> >> standards-compliant setup.
> >>
> >> Amos
> >>
> > Thanks Amos. This makes things even clearer. Actually, I'd say that my
> > problem is solved with the help of both of you. But well, let's have a
> > look at my design.
> >
> > My goal is to build an access control mechanism for my client
> > machines' access to the internet. As long as a user has not yet logged
> > in, his client box should be completely cut off from the internet, not
> > only from HTTP.
> >
> > The login is done through a web interface. This is where the URL
> > rewriting redirects any web traffic. After the user has logged in, the
> > client's HTTP packets will be DNATed to the other squid port in order
> > to be regularly proxied. I need the HTTP proxy for logging my users'
> > HTTP requests.
> >
> > Since the users' client machines are out of my control, it is
> > important for me that they don't need any special configuration.
> > That's why the squid must run in transparent mode.
> 
> Okay. As expected, a design problem. The huge problem with transparent
> intercept is that the browser is 100% unaware that the proxy exists. As
> far as it is concerned, the re-written splash page or redirect response
> is the actual response from somebody else's domain name (google or your
> bank for example). It has zero reason to think that a new TCP connection
> is needed for followup requests. Just because the server of that page
> replied Connection:close is no reason to expect Squid to pass the
> closure on to the client (quite the reverse, Squid will go out of its
> way to keep client connections open and re-used).
> 
> 
> To fit in with your existing config that would be:
> 
>   acl port3128 myportname 3128
>   deny_info http://your-login.example.com/ port3128
>   http_access deny port3128
> 
> The full details and some other tricks can be found at 
> http://wiki.squid-cache.org/ConfigExamples/Portal/Splash
> 
> This still hits the DNAT problems. I would suggest finding an
> external_acl_type helper that accesses whatever database your login
> script is recording client logins with, and using that as the ACL to
> deny / bounce new clients to the login page. With that design you can
> authorize a client on their initial request and continue using the
> connection afterwards.
> 
> NP: I recently posted to the list a version of the external_acl_type
> helper I use myself for exactly this type of portal setup.
> 
> Amos

Amos, I'm back. Thanks for your last posting.

Your trick with acl, deny_info and http_access was a big help.

As far as I understand, the external_acl_type helper needs to decide every few seconds whether a client is logged in or not. With some hundreds of clients, this means hundreds of database lookups per second. That's what I wanted to avoid by flipping the squid port when a user logs in or out, respectively. This way, I only have one iptables rule instead of multiple DB lookups.
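
For reference, a minimal sketch of what such an external_acl_type helper could look like in Python. This is not Amos's helper: the LOGGED_IN set, the example address and the function names are placeholders, and a real version would query the login database instead. It assumes Squid passes only %SRC on each lookup line and expects a single OK or ERR line back.

```python
#!/usr/bin/env python3
"""Sketch of a Squid external_acl_type helper that checks client logins."""
import sys

# Hypothetical stand-in for the login database; a real helper would query
# whatever store the login script writes to.
LOGGED_IN = {"10.17.0.23"}


def answer(line, logged_in=LOGGED_IN):
    """Return the reply for one lookup line whose first field is %SRC."""
    fields = line.split()
    if not fields:
        return "ERR"
    return "OK" if fields[0] in logged_in else "ERR"


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Squid writes one lookup per line and waits for one reply line each.
    for line in stdin:
        stdout.write(answer(line) + "\n")
        stdout.flush()  # replies must not sit in an output buffer


if __name__ == "__main__":
    main()
```

Squid caches each helper result for the ttl= value given on the external_acl_type line, so how often the helper is actually asked depends on that setting.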

As for the DNAT problem, I am considering simply running "conntrack -D" with appropriate -s and -d options from my login/logout script.
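
A rough sketch of how the login/logout script might build that call, written in Python for illustration. The function names and the example address are made up, filtering on -s plus TCP destination port 80 is only one possible choice of options, and the command needs root and the conntrack tool installed. Whether clients then reconnect cleanly through the new port has not been tested here.

```python
import ipaddress
import subprocess


def conntrack_delete_cmd(client_ip, dport=80):
    """Build the conntrack command that drops a client's tracked HTTP flows."""
    # ip_address() raises ValueError on malformed input, so nothing
    # unvalidated ends up on the command line.
    ip = str(ipaddress.ip_address(client_ip))
    return ["conntrack", "-D", "-s", ip, "-p", "tcp", "--dport", str(dport)]


def flush_client(client_ip, dport=80):
    """Run the command and hand the exit status back to the caller."""
    return subprocess.run(conntrack_delete_cmd(client_ip, dport)).returncode
```

The login script would call flush_client("10.17.0.23") right after switching the DNAT rule for that client.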

Hans