
Re: FreeBSD Squid timeout issue


 



Dave wrote:
Hi,
   Thanks for your reply. The following is the IP and abbreviated message:

(reason: 554 5.7.1 Service unavailable; Client host [65.24.5.137] blocked using dnsbl-1.uceprotect.net;

On to my squid issue: if aufs is less intensive and more efficient I'll definitely switch over to it. As for your suggestion about splitting into multiple files, I believe the version I have can do this; it has multiple acl statements for the safe_ports definition. My issue, though, is that the file has 15,000+ lines, and on investigation some 500 are duplicates. I'd rather not go through it manually to do the split. Is there a way I can split based on the dst, dstdomain, or url_regex types you referenced?
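On the aufs point: assuming the current cache_dir uses the ufs store type, the switch is normally just that one line in squid.conf. A minimal sketch follows; the path and size/L1/L2 values are placeholders (keep whatever your existing line has), and aufs has to be compiled in:

# existing line, something like:
#   cache_dir ufs /usr/local/squid/cache 2000 16 256
# same parameters, asynchronous aufs store type:
cache_dir aufs /usr/local/squid/cache 2000 16 256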

I just used the following commands and pulled off most of the job in a few minutes. The remainder that had to be left as regex patterns was small. Some of those duplicate entries in the domain-only list, but that can be dealt with later.


# Pull out the IPs
grep -v -E "[a-z]+" porn | sort -u >porn.ipa

# copy everything else (lines containing letters) into a temp file
grep -E "[a-z]+" porn | sort -u >temp.1

# pull out lines that are only a domain name
grep -E "^([0-9a-z\-]+\.)+[a-z]+$" temp.1 | sort -u >temp.d

# pull out everything that is not a bare domain name into another temp
grep -v -E "^([0-9a-z\-]+\.)+[a-z]+$" temp.1 | sort -u >temp.2
rm temp.1

# pull out lines that are domain/ or domain<space> and drop the trailing character
grep -E "^([0-9a-z\-]+\.)+[a-z]+[/ ]$" temp.2 | sed 's#[/ ]$##' | sort -u >>temp.d

# leave the rest as regex patterns
grep -v -E "^([0-9a-z\-]+\.)+[a-z]+[/ ]$" temp.2 | sort -u >porn.regex
rm temp.2

# sort the just-domains file and make sure there are no duplicates
sort -u temp.d > porn.domains
rm temp.d
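
The three resulting files can then be plugged into squid.conf using the acl types mentioned earlier; a rough sketch (the acl names and file paths here are only examples):

acl porn_ip dst "/usr/local/etc/squid/porn.ipa"
acl porn_dom dstdomain "/usr/local/etc/squid/porn.domains"
acl porn_re url_regex -i "/usr/local/etc/squid/porn.regex"
http_access deny porn_ip
http_access deny porn_dom
http_access deny porn_re

If subdomains should be blocked as well, the entries in porn.domains may want a leading dot (Squid treats ".example.com" as the domain plus all its subdomains).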

Amos
