Hello all,

Thank you Chris for the suggestion. It helped enormously. I have extracted the data I was looking for by using the following:

tail -n 5000 access.log | grep "403" | awk '{print $1}' | uniq -d > file.txt

Best regards
Frog.

----- Original Message -----
From: "Chris Robertson" <crobertson@xxxxxx>
To: squid-users@xxxxxxxxxxxxxxx
Sent: Thursday, 19 March, 2009 21:37:25 GMT +00:00 GMT Britain, Ireland, Portugal
Subject: Re: Extracting selected data from logfile

Frog wrote:
> Hello All,
>
> Hopefully someone may be able to assist me.
>
> I have Squid setup here as a reverse proxy. I have logging configured using the following settings in squid.conf:
>
> logformat combined %>a %ui %un [%tl] "%rm %ru HTTP/%rv" %Hs %<st "%{Referer}>h" "%{User-Agent}>h" %Ss:%Sh
> access_log /var/log/squid/access.log combined
>
> To block certain bots and bad user agents I have the following:
>
> acl badbrowsers browser "/etc/squid/badbrowsers.conf"
> http_access deny badbrowsers
>
> The http_access deny returns a 403 to a visitor that meets the criteria in badbrowsers.conf and this works perfectly. But I would like to take this one step further. I would like to build a blacklist in real time if possible of IP addresses that have been served a 403 error.
>
> Unfortunately my knowledge of most of the popular scripting languages is non-existent so I was wondering if something like a redirector could be configured to meet my needs?
>
> I have looked at fail2ban however it doesn't seem to parse my log files even if I change the squid log format to common.
>
> Basically I am wondering if there is a way to parse the logfile to append to a new file any IP address that was served a 403.
>

Something like...

tail -n 5000 /path/to/access.log |grep "HTTP/[^\"]*\" 403" |awk '{print $1}'

...run from the command line should (on my GNU/Linux machine) search the last 5000 lines (tail -n 5000) of the file at /path/to/access.log for the string "HTTP/" followed by any number of characters that are NOT a double quote, followed by a double quote, a space, and the string "403" (grep "HTTP...). The first column from any lines with a matching pattern will be printed (awk '{print $1}'). This is in no way tested, and obviously does not append to a file or run automatically.

> Thank you in advance for any pointers.
>
> Frog..
>

Chris
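
For anyone finding this thread later, the pipelines above can be combined into something that also appends to a file. What follows is only a minimal sketch, assuming GNU grep and the "combined" logformat quoted above. The function name collect_403_ips and the file paths are hypothetical, not anything Squid provides. Two details differ from the commands in the thread: the pattern has a trailing space after 403 so that a longer number starting with 403 is not matched, and sort -u replaces uniq -d, because uniq only compares adjacent lines and -d prints only the repeated ones.

```shell
#!/bin/sh
# Sketch only: collect client IPs that were served a 403 and append the ones
# not already listed to a blacklist file.
# collect_403_ips is a hypothetical helper name; both paths are arguments.
collect_403_ips() {
    log=$1
    blacklist=$2

    # Make sure the blacklist exists so grep -f has a file to read.
    touch "$blacklist"

    # 1. Take the last 5000 log lines.
    # 2. Keep lines whose request field is followed by status 403.
    # 3. Print the first column (the client IP, %>a in the logformat).
    # 4. Sort and drop duplicates.
    # 5. Drop IPs already present in the blacklist (whole-line, fixed-string match).
    new_ips=$(tail -n 5000 "$log" |
        grep 'HTTP/[^"]*" 403 ' |
        awk '{print $1}' |
        sort -u |
        grep -vxFf "$blacklist")

    # Append only if something new was found.
    if [ -n "$new_ips" ]; then
        printf '%s\n' "$new_ips" >> "$blacklist"
    fi
}
```

It could then be called as, for example, collect_403_ips /var/log/squid/access.log /path/to/blacklist.txt, by hand or from a scheduled job. It still only looks at the last 5000 lines on each run, so it is periodic rather than truly real time.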