hi there
I've been looking at Squid to provide me with a content-filtering proxy
that is publicly accessible, albeit with access control. The idea is
that my no-cache proxy, housed in a data center, is used by my household
and by friends and family with children whom they would rather didn't
accidentally or deliberately access sites with certain content.
In concept I'm aware of the difficulties of content filtering, but I've
come to the conclusion that the main show-stopper for this sort of setup
is bandwidth. Each household configures its DSL router to proxy through
this Squid proxy, meaning that each household's bandwidth usage adds
to the bandwidth usage of the proxy server.
One way around this would be to have a whitelist of domains (bbc.co.uk,
wikipedia.org) for which Squid would "forward" the HTTP request straight
to the destination servers, rewriting the TCP headers so that the
response from the destination goes straight back to the client, thus
saving a vast amount of bandwidth at the Squid proxy level. In effect,
the Squid proxy would only come into play when the requested URL is not in
the whitelist, saving precious processing power and bandwidth.
Having looked around I've come across many pieces of software, such as
redirectors that plug into Squid (Squirm, squidGuard, ...), that have
whitelist features, but the requests still get passed back to Squid to
fetch the site. DansGuardian (a pre-Squid URL and content processor) also
has whitelist features, but still relies on Squid to fetch whitelisted
sites.
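To make the limitation concrete, the whitelists in those setups end up looking
something like this squid.conf sketch (directive names as in Squid 2.6, the
domains are just examples):

```
# ACL listing the whitelisted destination domains (examples only).
acl whitelist dstdomain .bbc.co.uk .wikipedia.org

# Whitelisted requests skip the redirector / URL rewriter entirely...
url_rewrite_access deny whitelist

# ...but Squid still fetches and relays the reply itself, so all the
# whitelisted traffic continues to flow through the proxy box.
```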
I know it's possible (and perhaps written in stone in an RFC) to have the
client maintain a proxy exclusion list, but that would be unmanageable in
this sort of setup.
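For reference, the client-side exclusion list I mean is usually expressed as a
proxy auto-config (PAC) file, roughly like the sketch below. The domains and
the proxy address (proxy.example.net) are placeholders, and the dnsDomainIs()
fallback is only there so the sketch runs stand-alone (browsers' PAC engines
provide it natively):

```javascript
// Stand-alone fallback for the dnsDomainIs() helper that PAC engines
// normally supply: true if host equals the domain or ends with it.
if (typeof dnsDomainIs === "undefined") {
  var dnsDomainIs = function (host, domain) {
    var bare = domain.replace(/^\./, "");
    return host === bare || host.slice(-domain.length) === domain;
  };
}

function FindProxyForURL(url, host) {
  // Whitelisted domains go straight to the origin server, bypassing Squid.
  if (dnsDomainIs(host, ".bbc.co.uk") || dnsDomainIs(host, ".wikipedia.org"))
    return "DIRECT";
  // Everything else is filtered through the Squid box in the data center.
  return "PROXY proxy.example.net:3128";
}
```

Each browser (or the OS) would have to be pointed at this file, which is
exactly the per-client administration I'd like to avoid.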
Is there anything out there that I've missed, either an obscure Squid
patch or a tool hidden away somewhere, that could do what's described
above?
Thanks for your time,
Jack