On Mon, 1 Jun 2009 17:29:18 -0700, Ray Van Dolson <rvandolson@xxxxxxxx> wrote:
> Any suggestions on how to go about evaluating web traffic for
> "cacheability"?  I have access to a port that can see all the web
> traffic in our company.
>
> I'd like to be able to gauge how many hits there are to common sites to
> get a feel for how much bandwidth savings we could potentially gain by
> implementing a company-wide web cache.

Depending on how much you tune the config, expect the savings to fall
somewhere between 10% of your HTTP traffic as a lower bound (no tuning)
and 50% as an upper bound. That is for HTTP traffic only, so the overall
percentage is lower depending on how much non-HTTP traffic goes through
your network: if, say, HTTP is 60% of the total and you save 30% of it,
the overall saving is about 18%.

> I suppose creative use of tcpdump could be used here (obviously not
> catching https traffic), but maybe there's a more polished tool or some
> slicker way to do this.

The most reliable way to know is to set up a test proxy and push a small
amount of the traffic through it. The summary overview Squid provides
includes a measure of the percentage of bandwidth served as local HITs
(i.e. saved from going external); a rough sketch of pulling that figure
out of the access log is at the end of this mail.

The tools out there (www.ircache.net/cgi-bin/cacheability.py and
redbot.org) are spot-check tools for finding out why a particular
resource isn't caching once you already know which resource to look at;
a bare-bones version of that kind of header check is sketched below as
well. If anyone knows of a stand-alone tool, please speak up; I'm
interested as well.

Amos
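
P.S. Here is the promised sketch for the access log. It is a rough,
untested example assuming Squid's default native log format, where
field 4 is the result code (e.g. TCP_HIT/200) and field 5 is the reply
size in bytes; adjust the field indexes if you use a custom logformat.

#!/usr/bin/env python
# Sum the reply bytes in a Squid access.log (native format) and report
# what percentage of them was served by cache HITs of any kind.
import sys

hit_bytes = 0
total_bytes = 0

for line in open(sys.argv[1]):
    fields = line.split()
    if len(fields) < 5:
        continue                     # skip malformed lines
    code = fields[3]                 # e.g. TCP_HIT/200, TCP_MISS/200
    try:
        size = int(fields[4])        # reply size in bytes
    except ValueError:
        continue
    total_bytes += size
    if "HIT" in code:                # TCP_HIT, TCP_MEM_HIT, TCP_IMS_HIT, ...
        hit_bytes += size

if total_bytes:
    print("bandwidth served from cache: %.1f%%"
          % (100.0 * hit_bytes / total_bytes))
else:
    print("no requests found")

Point it at the log (e.g. /var/log/squid/access.log) after the test
proxy has been carrying traffic for a day or two.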
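
And the bare-bones header spot-check, doing by hand a small part of
what cacheability.py and redbot.org check. Again an untested sketch
(Python 3 standard library only); the real tools apply far more of the
caching rules than this does.

#!/usr/bin/env python3
# Fetch only the headers of one URL and report the response headers
# that determine whether a shared cache can store and reuse the reply.
import sys
import urllib.request

url = sys.argv[1]
req = urllib.request.Request(url, method="HEAD")
resp = urllib.request.urlopen(req)

cc = resp.headers.get("Cache-Control", "")
for name in ("Cache-Control", "Expires", "Last-Modified", "ETag", "Vary"):
    print("%-15s %s" % (name + ":", resp.headers.get(name) or "(none)"))

# Very rough classification; the real tools apply the full RFC 2616 rules.
if "no-store" in cc or "private" in cc:
    print("=> explicitly not storable by a shared cache")
elif "max-age" in cc or "s-maxage" in cc or resp.headers.get("Expires"):
    print("=> explicit freshness information; should produce HITs")
elif resp.headers.get("Last-Modified"):
    print("=> no explicit freshness; caches may store it heuristically")
else:
    print("=> no caching headers at all; unlikely to give useful HITs")

Some servers refuse HEAD requests; if you get an error back, change the
method to GET and accept pulling the body down once.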