Re: [change req] Allow fedorahosted robots.txt to only crawl /wiki/*

On Sun, 30 Dec 2012 20:07:37 -0500
Ricky Elrod <codeblock@xxxxxxxx> wrote:

> We've been seeing load spikes on hostedXX, following
> df7e8578432b224d9576dc8359f0729763861526. This semi-reverts that
> commit and only allows /wiki/* to be crawled.
> 
> diff --git a/configs/web/fedorahosted.org/fedorahosted-robots.txt
> b/configs/web/fedorahosted.org/fedorahosted-robots.txt
> index cd572f8..7782677 100644
> --- a/configs/web/fedorahosted.org/fedorahosted-robots.txt
> +++ b/configs/web/fedorahosted.org/fedorahosted-robots.txt
> @@ -1,5 +1,5 @@
>  User-agent: *
> -Disallow: /*/browser
> -Disallow: /*/search
> +Allow: /wiki/*
> +Disallow: /
>  user-agent: AhrefsBot
>  disallow: /
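
For reference, the hunk covers the whole file, so after this patch fedorahosted-robots.txt reads in full:

  User-agent: *
  Allow: /wiki/*
  Disallow: /
  user-agent: AhrefsBot
  disallow: /

Note that this relies on Allow support: major crawlers such as Googlebot use longest-match precedence, so the more specific Allow: /wiki/* wins over Disallow: / for wiki URLs. Crawlers that only implement the original 1994 draft ignore Allow lines entirely and will skip the whole site, which still serves the goal of shedding load.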

It seems like several things are contributing to the load, but it's
hard to isolate which (the timeline, changeset, and log views in
particular, since they all hit the repository browser).

I'm +1 on just applying this one for now; we can adjust it further
after the freeze is over.
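
Once the freeze lifts, one possible adjustment (a sketch only, assuming the usual Trac URL layout of /<project>/timeline, /<project>/changeset, and so on; the patterns would need checking against what the crawlers actually hit) would be to re-open crawling while keeping just the expensive views blocked:

  User-agent: *
  Disallow: /*/timeline
  Disallow: /*/changeset
  Disallow: /*/log
  Disallow: /*/browser
  Disallow: /*/search
  user-agent: AhrefsBot
  disallow: /

Wildcard Disallow patterns are understood by the big crawlers but are not part of the original spec, so anything that ignores them would crawl freely again; worth watching the load graphs after any such change.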

kevin

Attachment: signature.asc
Description: PGP signature

_______________________________________________
infrastructure mailing list
infrastructure@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/infrastructure
