Re: Managing user http bandwidth with squid cache

Amos Jeffries <squid3@xxxxxxxxxxxxx> · Wed, 17 Oct 2012 11:20:24 +1300

On 17.10.2012 04:47, Alan Dawson wrote:
Hi,

I'm at an educational establishment, with approx 2500 desktops.
We have had a restrictive web access policy implemented with
a web cache/filtering proxy appliance. User browsers are configured
by a PAC file and web proxy auto discovery. They authenticate against
the appliance with NTLM.

Okay.

We plan on changing that policy to something much less restrictive,
but one of the technical issues we are expecting is an increase in
web traffic usage.

Currently we use 60Mb/s at peak times (with 97% of that being HTTP
traffic), with our network connection being rated at 100Mb/s.

Traffic load in HTTP is best calculated and measured in
requests/second. You can flood a 10Gbps link with two or three requests,
or fetch >100K req/sec on a 10Mbps NIC. You can also have 2500 desktops
producing 1-2 req/sec combined, or each one producing 100K req/sec.

Your existing appliance should be able to give you a measurement of how
many req/sec it currently handles at average and peak traffic rates.
That measure, along with connections/second, will be useful in
estimating Squid requirements...
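
If the appliance can only export a request log rather than a ready-made
figure, the two numbers can be derived from the request timestamps. The
following is a minimal sketch, assuming you can extract one epoch
timestamp (in seconds) per request; the function name and the sample
values are made up for illustration and are not taken from any real
appliance.

```python
from collections import Counter


def request_rates(timestamps):
    """Return (average, peak) requests per second.

    `timestamps` is an iterable of epoch times in seconds, one entry per
    request. How you extract them depends on your appliance's log format.
    """
    timestamps = list(timestamps)
    if not timestamps:
        return 0.0, 0
    # Bucket requests into whole seconds.
    per_second = Counter(int(t) for t in timestamps)
    # Average over the whole observed span, including idle seconds.
    span = max(per_second) - min(per_second) + 1
    average = len(timestamps) / span
    # Peak is the busiest single second.
    peak = max(per_second.values())
    return average, peak


# Hypothetical sample: 5 requests spread over 4 seconds.
sample = [100.1, 100.5, 100.9, 101.2, 103.7]
avg, peak = request_rates(sample)
print(f"average {avg:.2f} req/sec, peak {peak} req/sec")
```

Averaging over the whole span (idle seconds included) is a choice made
here for simplicity; a sliding window would give a better picture of
sustained peaks.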

We'd like to manage the amount of bandwidth that we use at our site
connecting to high traffic sites like youtube/vimeo/bbc, so that there
is always capacity for critical web applications, for example online
examinations.

The filter/proxy appliance does not have any options for limiting
bandwidth.

One of the ways we are investigating would be to use a squid web cache
and delay_pools. We would try to identify high bandwidth/popular sites,
and either use a PAC file so clients choose the bandwidth restricting
cache, or use a cache chaining rule on the filter/proxy appliance, to
pass requests for particular sites to the bandwidth restricting cache.

If users connect to the squid cache directly we would authenticate
using Kerberos/NTLM for Windows clients and Basic for others.

Does this approach seem valid?

Sort of.

Definitely get away from NTLM. If you can start that migration now 
before even moving away from the appliance it will reduce the new 
changes being faced and simplify problem solving.

The ratio of the appliance's connection/sec rate to its req/sec rate
will tell you how efficiently the NTLM connections are currently being
utilized (2 requests for the handshake, 1...N for transaction data). So
take the connection/sec count and double it, then subtract that from the
req/sec rate to see how many req/sec Kerberos will hypothetically face.
It's a rough guess, since not all handshakes are 2 requests and the
req/sec profile *will* change.
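
As a worked example of that arithmetic, here is a minimal sketch. The
function names and the 500 req/sec and 50 conn/sec inputs are
hypothetical; substitute the figures your appliance reports. It assumes
exactly 2 handshake requests per connection, which is only an
approximation.

```python
def estimate_kerberos_req_rate(req_per_sec, conn_per_sec, handshake_requests=2):
    """Rough req/sec remaining once NTLM handshake requests are removed.

    Assumes every connection spends `handshake_requests` requests on the
    NTLM handshake. Real traffic will deviate from this.
    """
    overhead = handshake_requests * conn_per_sec
    return max(req_per_sec - overhead, 0.0)


def requests_per_connection(req_per_sec, conn_per_sec):
    """Average requests carried per connection, handshake included."""
    if conn_per_sec == 0:
        return 0.0
    return req_per_sec / conn_per_sec


# Hypothetical appliance figures, for illustration only.
req_rate = 500.0   # requests per second
conn_rate = 50.0   # new connections per second

print(requests_per_connection(req_rate, conn_rate))     # 10.0
print(estimate_kerberos_req_rate(req_rate, conn_rate))  # 400.0
```

With these made-up inputs each connection carries 10 requests, 2 of
them handshake, so roughly 400 of the 500 req/sec are transaction data.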

Squid delay pools are an old feature. I advocate proper TOS marking for 
bandwidth limiting where possible - since it can better account for 
traffic outside of Squid and local network conditions as well.

What kind of resource would the squid cache require ( RAM/CPU ... )

Whatever you can afford. These days the average consumer grade hardware 
can run Squid to the order of 200 req/sec - with some networks using 
good hardware measuring it up to ~2000 req/sec (including 
authentication, and some large ACLs).
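
To turn those figures into a first sizing guess, divide your measured
peak rate by what one box can sustain. This is only a sketch: the
1500 req/sec peak is a hypothetical input, and the 200 and 2000
req/sec capacities are the rough figures quoted above, not benchmarks
of your hardware.

```python
import math


def boxes_needed(peak_req_per_sec, capacity_per_box):
    """Number of Squid boxes needed to carry the peak rate (at least 1)."""
    if capacity_per_box <= 0:
        raise ValueError("capacity_per_box must be positive")
    return max(1, math.ceil(peak_req_per_sec / capacity_per_box))


peak = 1500  # hypothetical peak req/sec measured on the appliance

print(boxes_needed(peak, 200))   # consumer grade hardware: 8
print(boxes_needed(peak, 2000))  # good hardware: 1
```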

Amos