[Hotplug_sig] Re: PLM patch spider

On Thu, 23 Jun 2005, Dave Hansen wrote:

> On Mon, 2005-05-02 at 13:41 -0700, Judith Lebzelter wrote:
> > We have a cron job that will hit the patch directory once every three 
> > hours to check for new patches.  It also pulls a patch if it finds one.  
> > This is the same schedule we use for other kernel patches.
> 
> Is there a chance that you guys could update your PLM fetcher a little
> bit?  It likes to go looking for files that aren't actually present on
> my web server, which generates a fair amount of log output that I
> usually like to keep an eye on.  That isn't a big problem in and of
> itself, it just bloats the logs.
> 

Hi Dave,

Sorry about the logging.  The fetcher should not look for things that are 
not there; it only picks up the names of what is listed in the directory 
it points at, which is http://sr71.net/patches.  I set it to go down two 
levels looking for patches.
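
As a rough illustration of that behavior (not the actual PLM fetcher code), here is a minimal Python sketch of a crawl that requests only URLs it has seen linked in a directory listing and stops two levels below the start. The `fetch` callable, the `find_patches` and `LinkCollector` names, the `.patch`/`.diff` suffixes, and the reading of "2 levels" as two directory levels below the starting directory are all assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags in a directory listing page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_patches(base_url, fetch, max_depth=2):
    """Return patch URLs linked from listings at or below base_url.

    fetch is a hypothetical callable that maps a URL to the HTML text of
    that page. Only URLs that appear as links in a listing are requested.
    """
    if not base_url.endswith("/"):
        base_url += "/"
    found, seen = [], set()
    queue = [(base_url, 0)]  # (listing URL, depth below the start)
    while queue:
        url, depth = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        parser = LinkCollector()
        parser.feed(fetch(url))
        for href in parser.hrefs:
            target = urljoin(url, href)
            # Stay inside the starting directory and skip column-sort links.
            if not target.startswith(base_url) or "?" in target:
                continue
            if target.endswith("/"):
                # A subdirectory listing: descend only while within the limit.
                if depth < max_depth:
                    queue.append((target, depth + 1))
            elif target.endswith((".patch", ".diff")):  # assumed suffixes
                found.append(target)
    return found
```

Because every requested URL comes from a link in a listing that was just read, a crawl like this should not generate "file not found" entries unless the listing itself is stale.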

> The user agent strings are now something like: "PLM/0.1" and
> "lwp-trivial/1.41".  Could they, perhaps, get a little bit more
> informative, like:
> 
> 	"PLM/0.1 patch spider http://developer.osdl.org/dev/stp/";
> 

Good idea to give more info.  We had a problem with a busy robot last week 
and I was very happy they gave us enough info.
 
> Also, it would be kind if they obeyed robots.txt, or at least fetched
> it.  My log analyzer will detect robots based just on fetching
> "robots.txt" when beginning a crawl.

We should be obeying robots.txt.
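
For reference, a minimal Python sketch of what the two requests above could look like together: sending the more descriptive user agent string Dave suggested, and checking robots.txt rules before fetching anything. This is not the PLM fetcher's code; the function names and the use of "PLM" as the robot name matched against robots.txt are assumptions for illustration.

```python
import urllib.request
import urllib.robotparser

# User agent string as suggested in the quoted message above.
USER_AGENT = "PLM/0.1 patch spider http://developer.osdl.org/dev/stp/"
# Assumed token that a site owner would use in a "User-agent:" line.
ROBOT_NAME = "PLM"


def build_request(url):
    """Build a request that identifies the spider in its User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def parse_robots(robots_txt):
    """Parse the text of a robots.txt file into a rule set."""
    rules = urllib.robotparser.RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def fetch_if_allowed(url, rules, opener=urllib.request.urlopen):
    """Return the body at url, or None when robots.txt disallows it."""
    if not rules.can_fetch(ROBOT_NAME, url):
        return None
    with opener(build_request(url)) as response:
        return response.read()
```

In use, the spider would first request `/robots.txt` from the site through `build_request`, pass the decoded text to `parse_robots`, and then route every later fetch through `fetch_if_allowed`. Fetching robots.txt at the start of each crawl is also what lets a log analyzer like the one described above recognize the client as a robot.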

We will be able to do these updates but it may take a little while to get 
to them and deploy.

Judith

> 
> -- Dave
> 
