Re: Extracting selected data from logfile


Hello all,

Thank you Chris for the suggestion. It helped enormously. I have extracted the data I was looking for by using the following:

tail -n 5000 access.log | grep "403" | awk '{print $1}' | uniq -d > file.txt
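One point worth checking in the pipeline above: uniq only compares adjacent lines, and with -d it prints only lines that are repeated. Whether that is the desired behaviour depends on whether the blacklist should hold every address that received a 403 or only repeat offenders. The sketch below compares the variants on made-up addresses; sample_ips is a hypothetical helper used purely for illustration.

```shell
# Hypothetical sample of first-column output; the addresses are made up.
sample_ips() {
    printf '%s\n' 10.0.0.1 10.0.0.2 10.0.0.1 10.0.0.3 10.0.0.3
}

# uniq -d: only lines repeated on adjacent rows are reported.
sample_ips | uniq -d          # prints 10.0.0.3

# sort -u: every distinct address once, regardless of input order.
sample_ips | sort -u          # prints 10.0.0.1, 10.0.0.2, 10.0.0.3

# sort | uniq -d: every address that occurs more than once.
sample_ips | sort | uniq -d   # prints 10.0.0.1, 10.0.0.3
```
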

Best regards
Frog.


----- Original Message -----
From: "Chris Robertson" <crobertson@xxxxxx>
To: squid-users@xxxxxxxxxxxxxxx
Sent: Thursday, 19 March, 2009 21:37:25 GMT +00:00 GMT Britain, Ireland, Portugal
Subject: Re:  Extracting selected data from logfile

Frog wrote:
> Hello All,
>
> Hopefully someone may be able to assist me. 
>
> I have Squid set up here as a reverse proxy. I have logging configured using the following settings in squid.conf:
>
> logformat combined %>a %ui %un [%tl] "%rm %ru HTTP/%rv" %Hs %<st "%{Referer}>h" "%{User-Agent}>h" %Ss:%Sh
> access_log /var/log/squid/access.log combined
>
> To block certain bots and bad user agents I have the following:
>
> acl badbrowsers browser "/etc/squid/badbrowsers.conf"
> http_access deny badbrowsers
>
> The http_access deny returns a 403 to a visitor that meets the criteria in badbrowsers.conf, and this works perfectly. But I would like to take this one step further: I would like to build a blacklist, in real time if possible, of IP addresses that have been served a 403 error.
>
> Unfortunately my knowledge of most of the popular scripting languages is non-existent so I was wondering if something like a redirector could be configured to meet my needs?
>
> I have looked at fail2ban; however, it doesn't seem to parse my log files, even if I change the squid log format to common.
>
> Basically I am wondering if there is a way to parse the logfile to append to a new file any IP address that was served a 403.
>   

Something like...

tail -n 5000 /path/to/access.log | grep "HTTP/[^\"]*\" 403" | awk '{print $1}'

...run from the command line should (on my GNU/Linux machine) search the 
last 5000 lines (tail -n 5000) of the file at /path/to/access.log for 
the string "HTTP/" followed by any number of characters that are NOT a 
double quote, followed by a double quote, a space, and the string "403" 
(grep "HTTP...).  The first column from any lines with a matching 
pattern will be printed (awk '{print $1}').

This is in no way tested, and obviously does not append to a file or run 
automatically.
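As a rough sketch of the missing "append to a file" step, the pipeline could be wrapped in two small shell functions. This assumes the combined logformat quoted in the original question (client address in the first column, status code straight after the quoted request line). The function names list_403_ips and update_blacklist are hypothetical, as are any paths passed to them. The blacklist is merged and rewritten in sorted order rather than strictly appended to, so repeated runs do not add duplicates.

```shell
# Hypothetical helper: print each client IP that was served a 403, once.
# $1 is the path to the access log.
list_403_ips() {
    tail -n 5000 "$1" |
        grep "HTTP/[^\"]*\" 403" |
        awk '{print $1}' |
        sort -u
}

# Hypothetical helper: merge newly seen IPs into a blacklist file.
# $1 is the access log, $2 is the blacklist file (created if missing).
update_blacklist() {
    touch "$2" || return 1
    tmp=$(mktemp) || return 1
    # Combine the new IPs (stdin) with the existing list, dropping duplicates.
    list_403_ips "$1" | sort -u - "$2" > "$tmp" &&
        cat "$tmp" > "$2"    # overwrite in place to keep the file's permissions
    rm -f "$tmp"
}
```

Calling update_blacklist with the log path and a blacklist path from cron would give a periodic rather than truly real-time update.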

> Thank you in advance for any pointers.
>
> Frog..
>   

Chris

