Or use an alternative: ufdbGuard.
ufdbGuard is a URL filter for Squid that has a much easier
configuration file than the Squid ACLs and additional
configuration files.
ufdbGuard is also multithreaded and very fast.
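As a minimal sketch of how it hooks in (the install path and helper count here are assumptions; adjust for your build), Squid talks to the ufdbguardd daemon through small ufdbgclient helpers:

# assumed install location; point this at wherever ufdbgclient actually lives
url_rewrite_program /usr/local/ufdbguard/bin/ufdbgclient
url_rewrite_children 16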
And a tip: if you are really serious about blocking content, you should
also block 'proxy sites' (i.e. sites used to circumvent URL filters).
-Marcus
Amos Jeffries wrote:
Muhammad Sharfuddin wrote:
On Mon, 2010-03-22 at 08:47 +0100, Marcello Romani wrote:
Muhammad Sharfuddin ha scritto:
On Mon, 2010-03-22 at 19:27 +1300, Amos Jeffries wrote:
Thanks to the list for the help.
Restarting Squid is not a solution: I noticed that only 20 minutes
after restarting, Squid started consuming CPU again.
On Wed, 2010-03-17 at 19:54 +1100, Ivan . wrote:
you might want to check out this thread
http://www.mail-archive.com/squid-users@xxxxxxxxxxxxxxx/msg56216.html
I have not installed any package either, i.e. I have not tried that.
On Wed, 2010-03-17 at 05:27 -0700, George Herbert wrote:
or install the Google malloc library and recompile Squid to
use it instead of the default glibc malloc.
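For the archives, a rough sketch of that rebuild, assuming gperftools/tcmalloc is already installed (the prefix and options are illustrative, not a tested recipe):

# link Squid against tcmalloc instead of the default system malloc
./configure --prefix=/usr/local/squid LDFLAGS="-ltcmalloc"
make && make install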
On Wed, 2010-03-17 at 15:01 +0200, Henrik K wrote:
If the system regex is the issue, wouldn't it be better/simpler to just
compile with PCRE? (LDFLAGS="-lpcreposix -lpcre"). It doesn't leak and
as a bonus makes your REs faster.
Nor have I re-compiled Squid, as I have to use the binary/RPM version of
Squid that shipped with the distro I am using.
The issue was resolved by removing the ACL that blocked almost 60K
URLs/domains. Commenting out the following worked:
##acl porn_deny url_regex "/etc/squid/domains.deny"
##http_access deny porn_deny
So how can I deny illegal content/websites?
If those were actually domain names...
they are both URLs and domains
* use "dstdomain" type instead of regex.
OK, nice suggestion.
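For illustration: if domains.deny were reduced to bare domain names (one per line, with a leading dot to also cover subdomains), the same block needs no regex at all:

# domains.deny now holds plain entries like: .example.com
acl porn_deny dstdomain "/etc/squid/domains.deny"
http_access deny porn_deny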
Optimize the order of the ACLs so that most rejections happen as early
as possible, using the fastest match types.
I think it's optimized, as the rule (squeezing the CPU) is the first
rule in squid.conf.
That's the exact opposite of "optimizing", as the CPU-consuming rule
is _always_ executed.
The first rules should be cheap (i.e. non-regexp) and should block
most of the traffic, leaving the CPU-consuming ones at the bottom,
rarely executed.
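In other words, a shape like this (the ACL names and addresses are made up for illustration):

# cheap source-address test first: it stops most traffic with no regex work
acl mynet src 10.0.0.0/8
http_access deny !mynet
# expensive regex test last: only requests that survive the cheap tests reach it
acl badurls url_regex "/etc/squid/regex.deny"
http_access deny badurls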
If you don't mind sharing your squid.conf access lines we can work
through optimizing with you.
I posted squid.conf when I started this thread/topic, but I have no
issue posting it again ;)
I think he meant the list of blocked sites / url
It's 112K after compression; am I allowed to post/attach such a big
file?
The mailing list will drop all attachments.
squid.conf:
acl myFTP port 20 21
acl ftp_ipes src "/etc/squid/ftp_ipes.txt"
http_access allow ftp_ipes myFTP
The optimal form of that is:
acl myFTP proto FTP
http_access allow myFTP ftp_ipes
NP: Checking the protocol is faster than checking a whole list of IPs or
list of ports.
http_access deny myFTP
Since you only have two network IP ranges that might possibly be allowed
after the regex checks, it's a good idea to start the entire process by
blocking the vast range of IPs which are never going to be allowed:
acl vip src "/etc/squid/vip_ipes.txt"
acl mynet src "/etc/squid/allowed_ipes.txt"
http_access deny !vip !mynet
#### this is the acl eating CPU #####
acl porn_deny url_regex "/etc/squid/domains.deny"
http_access deny porn_deny
###############################
acl vip src "/etc/squid/vip_ipes.txt"
http_access allow vip
acl entweb url_regex "/etc/squid/entwebsites.txt"
http_access deny entweb
Applying the same process to entwebsites.txt that was done to the
domains.deny file will stop this one from becoming a second CPU waste.
acl mynet src "/etc/squid/allowed_ipes.txt"
http_access allow mynet
This is the basic process for reducing a large list of regex patterns
down to an optimal set of ACL tests.
What you can do to start with is separate all the domain-only lines from
the real regex patterns:
grep -E '^\^?((https?|ftp)://)?[a-z0-9\.-]+(/?\$?)$' /etc/squid/domains.deny > dstdomain.deny
grep -v -E '^\^?((https?|ftp)://)?[a-z0-9\.-]+(/?\$?)$' /etc/squid/domains.deny > url_regex.deny
... check the output of those two files. Don't trust my 2-second pattern
creation.
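A quick sanity check is that the line counts of the two output files add up to the original:

wc -l /etc/squid/domains.deny dstdomain.deny url_regex.deny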
You will also need to strip any "^", "$", "http://" and "/" bits off the
dstdomain patterns.
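Something along these lines may do it (a sketch; verify the output rather than trusting it blindly):

# strip a leading "^", a protocol prefix, then a trailing "$" and "/"
sed -E 's,^\^,,; s,^(ht|f)tps?://,,; s,\$$,,; s,/$,,' dstdomain.deny > dstdomain.clean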
When that's done, see if there are any domains you can wildcard in the
dstdomain list. Loading the result into squid.conf may produce WARNING
lines about other duplicates that can also be removed. I'll call the ACL
using this file "stopDomains" in the following example.
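For example, hypothetical entries like these:

www.example.com
shop.example.com
example.com

collapse into the single wildcard entry ".example.com", which dstdomain matches against the domain and all of its subdomains.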
For the other file, with the entries where the URL still needs a full
pattern match, split that to create another three files (a made-up
example follows the list):
1) dstdomains where the domain is part of the pattern. I'll call this
"regexDomains" in the following example.
2) the full URL regex patterns with domains in (1). I'll call this
"regexUrls" in the example below.
3) regex patterns where domain name does not matter to the match.
I'll call that "regexPaths".
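For illustration, with made-up patterns: "^http://badsite\.com/videos/" contributes "badsite.com" to file (1) and the full pattern to file (2), while a domain-agnostic pattern like "/webproxy/" belongs in file (3):

regexDomains:  badsite.com
regexUrls:     ^http://badsite\.com/videos/
regexPaths:    /webproxy/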
When that's done, change your config to make your CPU-expensive lines:
acl porn_deny url_regex "/etc/squid/domains.deny"
http_access deny porn_deny
change into these:
# A
acl stopDomains dstdomain "/etc/squid/dstdomain.deny"
http_access deny stopDomains
# B
acl regexDomains dstdomain "/etc/squid/dstdomain.regexDomains"
acl regexUrls url_regex -i "/etc/squid/regex.urls"
http_access deny regexDomains regexUrls
# C
acl regexPaths urlpath_regex -i "/etc/squid/regex.paths"
http_access deny regexPaths
As you can see regex is not done unless it really has to be done.
At "A" the domains which don't have to use regex at all get blocked
very fast with little CPU usage.
At "B" the domains get checked and only the ones which might actually
patch get a regex done to them.
At "C" we have no choice so a regex is done as before. But (a) the list
should now be very small and not use much CPU, and (b) most of the
blocked domains are already blocked.
Amos