How would you do it? With wget, the only way of having it crawl through websites is to recurse... isn't it?
I tried screwing around, and the best I came up with was this:

>#!/bin/bash
># Re-read the tail of the Squid access log every few seconds and fetch
># the HTML pages it lists (plus their page requisites).
>log="/var/log/squid3/access.log"
>
>while true; do
>    echo "reading started: `date`, log file: $log"
>    sudo tail -n 80 "$log" | grep -P "/200 [0-9]+ GET" | grep "text/html" \
>        | awk '{print $7}' | wget -q -rp -nd -l 1 --delete-after -i -
>    sleep 5
>    echo
>done

It's not so clean...

On Tue, Oct 5, 2010 at 11:51 AM, John Doe <jdmls@xxxxxxxxx> wrote:
>
> From: flaviane athayde <flavianeathayde@xxxxxxxxx>
>
> > I tried to put together a shell script that reads the Squid log and uses
> > it to run wget with the "-r -l1 -p" flags, but it also fetches its own
> > pages, making an infinite loop, and I can't resolve it.
>
> Why recurse?
> If you take your list from the log files, you will get all accessed files
> already... no?
>
> JD
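
For reference, a rough sketch of the non-recursive variant JD is suggesting: follow the log with tail -F and feed each freshly logged URL to wget with -p only (no -r), so wget's own requests never loop back in. Untested, and it assumes the same log path and field layout as the script above (URL in column 7).

>#!/bin/bash
># Follow the Squid access log and prefetch each logged HTML page with its
># page requisites only. No recursion, so nothing wget fetches can feed
># the loop again.
>log="/var/log/squid3/access.log"
>
>sudo tail -F -n 0 "$log" \
>    | grep --line-buffered -P "/200 [0-9]+ GET" \
>    | grep --line-buffered "text/html" \
>    | awk '{ print $7; fflush() }' \
>    | while read -r url; do
>        wget -q -p -nd --delete-after "$url"
>    done

Since tail -F never re-reads old lines, this also avoids refetching the same URLs every 5 seconds the way the tail -n 80 loop does.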