
Re: bad regex is blocking the wrong sites


 



On Mon, 03 Oct 2011 14:42:43 -0700, devadmin wrote:
Hello, I'm new to blocking with Squid. Right now I'm using a bad-site list and that works fine; it blocks URLs as it should. But I'm also experimenting with the bad-regex style blacklist because I see a lot of porn is still getting through, and the badregex list is blocking FarmVille/Zynga content as well as AOL email! I would like to know why "gay" and "porn" would cause
AOL and FarmVille to be blocked, and any suggestions that might be


Welcome to the world of filtering. Just about every admin on this planet has tried it at some point, and none has succeeded yet. The best advice is not to bother; try other means. If you continue with this, good luck.


helpful would be very much appreciated. I have teenagers on the LAN
and need to protect them from this garbage to the best of my ability.

Protection begins with education and awareness. The form of "protection" you are attempting is akin to blindfolding them and tying them up in a closet. As soon as they move out of the sanitised zone you are building, they will have to face more hardened peers and come off worse for it. Denial of access to information (bad experiences included) is a violation of human rights.

That said, I know there are places (certain countries and school systems) which mandate this kind of filtering. If you are operating inside one of those you will find it better practice to make extensive use of local whitelists and public blacklists. The public blacklists have professionals paid to make them correct and keep up with changes. It is more than a full time job keeping up with the thousands of new websites which appear every day.


Here's the contents of the bad-regex blacklist I'm using, just a single
line.

.*porn*.*

One entry, and this single entry causes all those sites/services and
more to be blocked. What am I doing wrong?

That regex matches the text "por", contained anywhere in the object being scanned.

 .*   -> zero or more of any character
 por  -> a 'p' followed by 'o' followed by 'r'
 n*   -> a sequence of _zero_ or more 'n'
 .*   -> zero or more of any character

e.g. http://PORtal.facebook.com/
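The mismatch is easy to reproduce from a shell. This is a sketch using grep -E, whose POSIX extended syntax is comparable to what Squid's url_regex uses; the URL is just the example above:

```shell
# '.*porn*.*' only requires "por": the n* matches zero or more 'n',
# and the .* on each side match anything, so "PORtal" is enough.
echo "http://PORtal.facebook.com/" | grep -Eiq '.*porn*.*' && echo blocked
# → blocked

# Dropping the stray quantifiers so the whole word "porn" is required:
echo "http://PORtal.facebook.com/" | grep -Eiq 'porn' || echo allowed
# → allowed
```

Even the corrected pattern will still match innocent words that merely contain "porn", so a whitelist remains necessary.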

<snip>

acl manager proto cache_object
acl localhost src 127.0.0.1/32
acl to_localhost dst 127.0.0.0/8 0.0.0.0/32
acl localnet src 10.10.1.0/24 # RFC 1918 possible internal network
acl blacklist dstdomain "/etc/squid3/squid-block.acl"
#acl badregex url_regex -i "/etc/squid3/badregex.acl"

url_regex is not a great idea if you are writing the lists yourself. It matches the entire URL end-to-end including the query string portion and path. You want to have different word filters for each piece of the URL.
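One way to scope the matching, sketched here with hypothetical file names, is to test only the hostname with dstdom_regex and keep a separate, stricter word list for paths via urlpath_regex:

```
# Hypothetical split: one regex list for hostnames, one for URL paths.
acl bad_domains dstdom_regex -i "/etc/squid3/badregex-domains.acl"
acl bad_paths   urlpath_regex -i "/etc/squid3/badregex-paths.acl"

http_access deny bad_domains
http_access deny bad_paths
```

That way a word pattern intended for hostnames can never trip over a query string, and vice versa.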


http_access deny blacklist
http_access deny badregex

The first step to using regex blocklists safely is to reduce the places where you are testing it.

At minimum add a whitelist:
  http_access deny !whitelistA blacklist
  http_access deny !whitelistB badregex
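A minimal sketch of those whitelist ACLs, assuming two hypothetical dstdomain files you maintain yourself:

```
# Hypothetical whitelist files; domains listed here bypass the filters.
acl whitelistA dstdomain "/etc/squid3/whitelistA.acl"
acl whitelistB dstdomain "/etc/squid3/whitelistB.acl"

http_access deny !whitelistA blacklist
http_access deny !whitelistB badregex
```

The `!whitelistA` term means the deny rule only fires when the destination is NOT on the whitelist, so known-good sites such as facebook.com can be exempted from the regex test entirely.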


http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access allow localhost
http_access allow localnet
http_access deny all


Amos

