On 8/01/2016 9:48 a.m., Jason Haar wrote:
> On 08/01/16 01:56, Marcus Kool wrote:
>> Can you explain what the huge number of regexes is used for ?
> malware urls. I'm scraping them from publicly available sources like
> phishtank, malwaredomains.com. Ironically, they don't need to be regexes
> - but squid only has a "url_regex" acl type - so regex it is (can't use
> dstdomain because we want to block "http://good.site/bad.url" - not all
> of "good.site")
>

But you do want to block all of http://good.site/bad\.url.* right?

Otherwise the malware can get around the protection trivially just by
adding a meaningless suffix to it.

With all the scraping, are you also filtering for duplicates and reducing
multiple URLs in one domain down to fewer entries?

Amos
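
For what it's worth, the duplicate filtering and prefix reduction asked about above could look roughly like this. It is only a sketch in Python, not anything taken from the thread: the function names (escape_ere, reduce_urls, to_patterns) are made up for illustration, the leading "^" anchor is an assumption about how the patterns would be written, and it assumes the scraped feed is a plain list of literal URLs. Each URL is treated as a prefix, so an entry already covered by a shorter listed URL is dropped, and regex metacharacters such as "." are escaped so they match literally.

```python
# Characters that are special in POSIX extended regular expressions.
ERE_META = set(".[]()*+?{}|^$\\")


def escape_ere(text):
    """Backslash-escape regex metacharacters so the text matches literally."""
    return "".join("\\" + ch if ch in ERE_META else ch for ch in text)


def reduce_urls(urls):
    """Drop blanks and exact duplicates, then drop any URL that merely
    extends another listed URL (the shorter prefix already covers it)."""
    cleaned = sorted({u.strip() for u in urls if u.strip()})
    kept = []
    for url in cleaned:
        # After sorting, a URL's prefix (if listed) is the last kept entry.
        if kept and url.startswith(kept[-1]):
            continue
        kept.append(url)
    return kept


def to_patterns(urls):
    """Turn literal URLs into anchored prefix patterns (one per URL)."""
    return ["^" + escape_ere(u) for u in reduce_urls(urls)]


if __name__ == "__main__":
    scraped = [
        "http://good.site/bad.url",
        "http://good.site/bad.url?x=1",
        "http://good.site/bad.url",
        "http://other.site/evil",
    ]
    for pattern in to_patterns(scraped):
        print(pattern)
```

The printed lines could then go into a file used by a url_regex acl. Note that prefix matching is deliberately broad: an entry for "http://good.site/bad" would also cover "http://good.site/badger", so very short scraped URLs are worth reviewing before they go into the list.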