On 17.10.2012 04:47, Alan Dawson wrote:
Hi,
I'm at an educational establishment, with approx 2500 desktops.
We have had a restrictive web access policy implemented with
a web cache/filtering proxy appliance. User browsers are configured
by a PAC file and web proxy auto discovery. They authenticate
against the appliance with NTLM.
Okay.
We plan on changing that policy to something much less restrictive,
but one of the technical issues we are expecting is an increase in
web traffic usage.
Currently we use 60Mb/s at peak times (with 97% of that being HTTP
traffic), with our network connection being rated at 100Mb/s.
Traffic speed in HTTP is best calculated and measured in
requests/second. You can flood a 10Gbps link with two or three requests,
or fetch >100K req/sec on a 10Mbps NIC. You can also have 2500 desktops
producing 1-2 req/sec combined, or all producing those 100k req/sec
each.
Your existing appliance should be able to give you a measurement of how
many req/sec it currently handles at average and peak traffic rates.
That measure along with connections/second will be useful in estimating
Squid requirements...
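
If the appliance does not report those figures directly, they can be approximated from its access log. A minimal sketch, assuming you can export one arrival timestamp (seconds since the epoch) per request; the function name and input format are hypothetical, not something the appliance is known to provide:

```python
from collections import Counter


def request_rates(timestamps):
    """Return (average, peak) requests per second.

    timestamps: iterable of request arrival times in epoch seconds
    (assumed input format, e.g. parsed from an access log).
    """
    # Bucket the requests into one-second bins.
    per_second = Counter(int(t) for t in timestamps)
    if not per_second:
        return 0.0, 0
    # Average over the whole observed span, including idle seconds.
    span = max(per_second) - min(per_second) + 1
    average = sum(per_second.values()) / span
    # Peak is the busiest single second seen.
    peak = max(per_second.values())
    return average, peak
```

Running the same binning over connection-open times would give the connections/second figure.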
We'd like to manage the amount of bandwidth that we use at our site
connecting to high traffic sites like youtube/vimeo/bbc, so that there is
always capacity for critical web applications, for example online
examinations.
The filter/proxy appliance does not have any options for limiting bandwidth.
One of the ways we are investigating would be to use a squid web cache and
delay_pools. We would try to identify high bandwidth/popular sites, and
either use a PAC file so clients choose the bandwidth restricting cache, or
use a cache chaining rule on the filter/proxy appliance, to pass requests
for particular sites to the bandwidth restricting cache.
If users connect to the squid cache directly we would authenticate using
Kerberos/NTLM for windows clients and Basic for others.
Does this approach seem valid ?
Sort of.
Definitely get away from NTLM. If you can start that migration now
before even moving away from the appliance it will reduce the new
changes being faced and simplify problem solving.
The ratio of appliance connection/sec rate to req/sec rate will tell
you how efficiently the NTLM connections are currently being utilized (2
requests for handshake, 1...N for transaction data). So take the
connection/sec count and double it, then subtract that from the req/sec
rate to see how many req/sec Kerberos will hypothetically face. It's a
rough guess, since not all handshakes are 2 requests and the req/sec
profile *will* change.
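
That arithmetic can be sketched as below. The function name is hypothetical and the figures in the usage line are made up for illustration, not measurements from any real appliance:

```python
def estimate_post_ntlm_load(req_per_sec, conn_per_sec, handshake_requests=2):
    """Rough req/sec left once NTLM handshake requests are removed.

    Assumes every connection spends `handshake_requests` requests on the
    NTLM handshake (2 is the rule of thumb above; real handshakes vary).
    """
    handshake_load = handshake_requests * conn_per_sec
    # Clamp at zero: inconsistent measurements should not yield a negative rate.
    return max(req_per_sec - handshake_load, 0)


# Illustrative numbers only: 500 req/sec over 100 new connections/sec.
print(estimate_post_ntlm_load(500, 100))  # prints 300
```

Treat the result as a starting point for sizing only, for the reasons given above.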
Squid delay pools are an old feature. I advocate proper TOS marking for
bandwidth limiting where possible - since it can better account for
traffic outside of Squid and local network conditions as well.
What kind of resource would the squid cache require ( RAM/CPU ... )
Whatever you can afford. These days average consumer-grade hardware
can run Squid on the order of 200 req/sec, with some networks using
good hardware measuring it at up to ~2000 req/sec (including
authentication, and some large ACLs).
Amos