Re: tuning question

On Sat, Jul 12, 2014 at 5:06 PM, Miles Fidelman <mfidelman@xxxxxxxxxxxxxxxx> wrote:
Jeff Trawick wrote:

On Sat, Jul 12, 2014 at 1:25 PM, Miles Fidelman <mfidelman@xxxxxxxxxxxxxxxx <mailto:mfidelman@meetinghouse.net>> wrote:

    Hi Folks,

    Every once in a while, a crawler comes along and starts indexing
    our site - and in the process pushes our server's load average
    through the roof.

    Short of blocking the crawlers, can anybody suggest some quick
    tuning adjustments to make, to reduce load (setting the max.
    number of servers and/or requests, renicing processes)?


Use robots.txt to block access to dynamically generated resources which are
expensive to generate and not necessary for search hits?

Is the load coming from a lot of concurrent requests, or mainly from
the cost of the individual requests it is making?

a bit of both


If you want to limit concurrent requests just from web crawlers, try something like mod_qos.  (See http://unix.stackexchange.com/questions/37481/throttling-web-crawlers)
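As a rough sketch of the mod_qos approach (directive names are from the mod_qos documentation, but the User-Agent pattern and the limit of 3 are illustrative - adjust both for your site, and check the module path for your distro):

```apache
# Load mod_qos (module path varies by distro/packaging)
LoadModule qos_module modules/mod_qos.so

# Tag requests whose User-Agent looks like a crawler
# (the pattern here is only an example)
BrowserMatchNoCase "(googlebot|bingbot|baiduspider|crawler|spider|bot)" QS_IsBot

# Allow at most 3 concurrent requests from tagged crawlers;
# further requests get a 500-class error instead of piling up
QS_EventRequestLimit QS_IsBot 3
```

Legitimate users are unaffected; only requests matching the BrowserMatch pattern count against the limit.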

If it were me, I'd try to block needless, expensive requests with robots.txt too.  http://www.robotstxt.org/robotstxt.html
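For example, something along these lines in robots.txt at the document root - the paths here are hypothetical, so substitute whatever your expensive dynamic URLs actually are (and note that Crawl-delay is non-standard, honored by some crawlers and ignored by others, notably Googlebot):

```
User-agent: *
Disallow: /cgi-bin/
Disallow: /search
Crawl-delay: 10
```

Well-behaved crawlers will skip the disallowed paths entirely; badly behaved ones ignore robots.txt, which is where something like mod_qos comes in.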






--
In theory, there is no difference between theory and practice.
In practice, there is.   .... Yogi Berra


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx




--
Born in Roswell... married an alien...
http://emptyhammock.com/

