On 18/05/2013 6:23 p.m., Helmut Hullen wrote:
Hello, Amos,
You wrote on 18.05.13:
SG has numerous problems which cause it not to do what it's
supposed to, including that "emergency" mode thing. Here are some
things to consider:
1) a BIG blacklist is overhyped - when I had a good look at our
requirements, there was only a small percentage of those websites
we actually wanted to block; the rest were squatting websites,
non-existent, or not relevant. Squid could blacklist (e.g. ACL
deny) those websites natively with a minimum of fuss.
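As a rough sketch of what that native blacklisting looks like (the
domain names here are invented), a short list can be denied directly
in squid.conf:

  acl blocked_sites dstdomain .example-casino.com .example-adult.net
  http_access deny blocked_sites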
Maybe - it does a good job even with these unnecessary entries.
If the list is that badly out of date it will also be *missing* a
great deal of entries.
Yes - maybe. But updating the list is a really simple job.
2) SG has not been updated for 4 or 5 years; if that's your latest
version, you are still out of date.
I can't see a big need for updating. Software really doesn't need
changes ("updates") every month or so.
For regular software, yes. But security software which has set itself
up as enumerating badness/goodness for a control method needs
constant updates.
Maybe - but "squidguard" does a really simple job: it looks into a list
of disallowed domains and URLs and then decides whether to allow or to
deny. That job doesn't need "constant updates".
Unfortunately it does so by forcing all the complications into Squid.
In order for SG to do that "really simple job", Squid is required to:
* manage a group of sub-processes, including all error handling when
they fail or hang.
* generate and process requests and responses in a protocol to
communicate with those sub-processes
* schedule client request handling around the delay from external
processing, including recovery on SG errors
* clone the HTTP request and perform a sub-request when a redirected-to
URL is presented by SG.
Much better to have Squid do the simple ACL task and drop all of the
above complications.
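For reference, all of the above machinery is pulled in by the usual
redirector wiring; a typical stanza looks something like this (the
paths and child count are only examples):

  url_rewrite_program /usr/bin/squidGuard -c /etc/squidguard/squidGuard.conf
  url_rewrite_children 10

Dropping those two lines and replacing them with plain ACLs removes
the whole sub-process layer in one go.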
Not to mention that Markus fed back a lot of the ufdbGuard improvements
into Squid-3.2 and we now have ACLs which operate reasonably fast over
big lists of regex. Not that using big lists of regex is a great idea
anyway.
More to the point, you will not find much help now, or anyone to
fix it even if you could prove it's a bug.
"That depends!" - I know many colleagues who use "squidguard" since
years; the program doesn't need much help.
During which time a lot of things have progressed. Squid has gained a
lot of ACL types, better regex handling, better memory management, and
an external ACL helper interface (which most installations of SG
should really be using).
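As a sketch of that interface (the helper path and ACL name here are
invented), an external ACL helper receives whatever format tokens you
choose and answers OK or ERR per lookup, with Squid caching the
results:

  external_acl_type urlcheck ttl=60 %URI /usr/local/bin/urlcheck-helper
  acl blocked external urlcheck
  http_access deny blocked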
Which brings me back to my question of what SG was being used for. If
it is something which current Squid is capable of doing without
SG then you may be able to gain better traffic performance simply by
removing SG from the software chain. As csn233 found, it may be
worth it.
The squidguard job is working with a really big blacklist, and working
with some specialized ACLs.
Which, apart from the list files, is all based on information
sent to it by Squid.
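Concretely, each lookup arrives at SG as one line on stdin in the
redirector helper format, roughly like this (the values here are
illustrative):

  http://example.com/some/page 192.0.2.1/- jsmith GET -

i.e. URL, client IP/FQDN, user, method - all information Squid already
holds before the helper is ever consulted.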
I know "squid" can do this job too - and I maintain a schoolserver which
uses many of these possibilities of "squid". But then some other people
has to maintain the blacklist. That's no job for the administrator in
the school.
You are the first to mention that change of job.
The proposal was to:
* make Squid load the blacklist
* remove SG from the software chain
* watch response time improve ?
Nowhere in that sequence does it require any change of who is creating
the list.
At most the administrator may need to run a tool to convert from some
strange format to one Squid can load. (FWIW: both squidblacklists.org
and Shalla provide lists which have already been converted to
Squid-compatible formats).
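Once converted, loading is a single ACL line pointing at the file (the
path here is hypothetical; the file holds one domain per line):

  acl blacklist dstdomain "/etc/squid/blacklists/domains.txt"
  http_access deny blacklist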
"better traffic performance" may be a criteria, but (p.e.) blocking porn
URLs is (in schools) a criteria too.
Teachers have to look at "legal protection for children and young
persons" too.
I'm just talking about shifting the checks to the place where they can
be tested most effectively. Not removing them.
Squid already has the information about user login, IP address, MAC
address, URL. No doubt Squid is already allowing/denying access based on
the login and IP which users are trying to get access with. Making Squid
load the blocklist and use it in the http_access controls is relatively
simple.
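A minimal sketch of that combination, assuming basic authentication is
already configured (the ACL names and file path are invented):

  acl logged_in proxy_auth REQUIRED
  acl blacklist dstdomain "/etc/squid/blacklist_domains.txt"
  http_access deny blacklist
  http_access allow logged_in
  http_access deny all

The ordering matters: the blacklist deny is tested before anyone is
allowed through.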
So what is left for SG to do? In most cases you will find the answer
is "nothing".
Note that we have not even got near discussing the content of those
"regex" lists. I've seen many SquidGuard installations where the
rationale for holding onto SG was that squid "can't handle this many
regex". Listing 5 million domain names in a file with some 1% having a
"/something" path tacked on the end does not make it a regex list.
** Split the file into domains and domain+path entries. Suddenly you
have a small file of url_regex, a small file of dstdom_regex and a long
list of dstdomain ... which Squid can handle.
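A hedged sketch of the result (file names invented; the long plain
domain list gets fast dstdomain lookups, and only the two small
pattern files stay as regex):

  acl bl_domains dstdomain "/etc/squid/bl_domains.txt"
  acl bl_dompat dstdom_regex -i "/etc/squid/bl_domain_patterns.txt"
  acl bl_urls url_regex -i "/etc/squid/bl_url_patterns.txt"
  http_access deny bl_domains
  http_access deny bl_dompat
  http_access deny bl_urls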
Amos