Re: suddendly high cpu load because of googlebot


 



On Fri, Mar 9, 2012 at 8:07 AM, Simone Frattegiani
<simone.frattegiani@xxxxxxxxx> wrote:
> Hello,
>
> i suddendly started having CPU load issues, like this:

A couple of things:

> Server-status shows this:
>
> […]
> ReqPerSec: 6.37383
> BusyWorkers: 99
> IdleWorkers: 10
> Scoreboard: WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
>

This is what worries me. You are handling about 6.4 requests per
second, but have roughly 100 busy workers handling them. This means
your requests are really, really slow. When requests are really slow,
and the server gets a little busy, two things happen:

1) Requests take even longer to serve, as the resource contention
(probably on the database) increases
2) Apache has to serve more simultaneous requests, and so needs more
children. Starting and running those processes uses more CPU, and each
new process uses more RAM.

When this reaches the tipping point, whether through lack of RAM, CPU
or IOPS, Apache has to start processes faster than it can serve
requests, and the load will metaphorically explode.
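
To put a rough number on "really slow": by Little's law, the average
time a worker stays busy is the number of busy workers divided by the
request rate. A minimal Python sketch using the two figures from the
server-status output quoted above (the function name is only mine, for
illustration):

```python
def avg_seconds_per_request(busy_workers: int, req_per_sec: float) -> float:
    """Little's law: L = lambda * W, so W = L / lambda."""
    if req_per_sec <= 0:
        raise ValueError("req_per_sec must be positive")
    return busy_workers / req_per_sec


# BusyWorkers and ReqPerSec as quoted from server-status.
seconds = avg_seconds_per_request(busy_workers=99, req_per_sec=6.37383)
print(f"each worker is busy for about {seconds:.1f} s per request")
```

Treat the result as an order-of-magnitude estimate only: ReqPerSec is
averaged over the whole uptime of the server, while BusyWorkers is a
snapshot of the current moment.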

> […]
> The commands "netstat -a | grep 66.249" shows 20 connections from
> googlebot ip, 1 in ESTABILISHEd status, the others in TIME WAIT.
>

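So this may be largely irrelevant. Connections in TIME_WAIT have
already been closed; the kernel just keeps the socket around for a
short while, and they use neither CPU nor an Apache slot. The one
ESTABLISHED connection may be an idle keep alive connection. A keep
alive connection isn't using your CPU, so it isn't slowing down your
webserver. What it is doing is taking up a slot, which isn't ideal.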
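
If you want to see how the connections from that address range split
across TCP states, something like the following could tally them. It
is only a sketch: it assumes Linux `netstat -ant` style output, and
the sample lines and addresses are made up for illustration.

```python
from collections import Counter


def count_tcp_states(netstat_output: str, ip_prefix: str) -> Counter:
    """Count TCP states for connections whose foreign address starts with ip_prefix.

    Expects `netstat -ant` style columns:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State
    """
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines and non-TCP sockets.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        foreign_address, state = fields[4], fields[5]
        if foreign_address.startswith(ip_prefix):
            states[state] += 1
    return states


# Made-up sample output, not taken from the server in question.
sample = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.0.2.10:80           66.249.0.1:51234        ESTABLISHED
tcp        0      0 192.0.2.10:80           66.249.0.2:51300        TIME_WAIT
tcp        0      0 192.0.2.10:80           66.249.0.3:51301        TIME_WAIT
tcp        0      0 192.0.2.10:80           198.51.100.7:40000      ESTABLISHED
"""
print(count_tcp_states(sample, "66.249."))
```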

> If i restart apache, everything gets back to normal.
>
> Any suggestions?
> THanks!
>

I have a couple!

First of all, you are using the prefork MPM. This means that to get an
extra slot, Apache has to fork and start up a new child. This is not
efficient! If you use the worker or event MPM, each child has multiple
threads, so you do not need to start, or keep starting, so many
children. You will also have lower CPU/RAM costs per slot.

This will stop CPU usage exploding when a lot of requests come in, and
hopefully you can serve requests faster.

Secondly, if you are worried about keep alive connections from
googlebot, you should seriously consider the event MPM. In each child,
this uses a single dedicated thread to hold ALL of that child's keep
alive connections, waiting for a new request on them, and handing the
request off to a worker thread once there is data. This means you
don't use up a slot for every idle keep alive connection.
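
To give a feel for the difference, here is a rough Python sketch that
applies Little's law to slot usage. The model is my own
simplification: each connection is busy for a while and then sits idle
on keep alive before closing, and the figures are hypothetical, not
measurements from your server.

```python
def slots_needed(conn_per_sec: float, busy_s: float,
                 idle_keepalive_s: float, event_mpm: bool) -> float:
    """Average number of worker slots held, via Little's law.

    Simplifying assumption: each connection is busy for busy_s seconds
    in total, then idles for idle_keepalive_s seconds before closing.
    """
    # With prefork/worker the idle keep alive time also occupies a slot;
    # with event the idle connection is parked on the listener thread.
    held_s = busy_s if event_mpm else busy_s + idle_keepalive_s
    return conn_per_sec * held_s


# Hypothetical figures, for illustration only.
for event in (False, True):
    print(event, slots_needed(conn_per_sec=5.0, busy_s=0.2,
                              idle_keepalive_s=5.0, event_mpm=event))
```

The point is only that, with long keep alive timeouts, most of the
slots in a prefork or worker setup can end up waiting rather than
working.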

Finally, you don't mention it, but I assume you are running some web
application, in something like PHP? I would recommend using fastcgi to
host the application, to divorce it entirely from Apache.
This will give you a clear idea of how much of your resources are
being devoted to PHP, and will actually reduce the resources used
compared to mod_php. mod_php adds a PHP interpreter to every Apache
child, whether it is serving a PHP request, a static file from disk,
or monitoring a keep alive connection.

You can also strictly control how many PHP processes are used when
running with fastcgi.
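
As a back-of-the-envelope illustration of why that matters, here is a
small Python sketch comparing the two layouts. The per-process sizes
and the PHP pool size are hypothetical placeholders (measure your
own); 109 is simply the busy plus idle workers from your
server-status. It also ignores memory shared between processes, so it
overstates both totals.

```python
def mod_php_memory_mb(apache_children: int, apache_mb: float,
                      php_mb: float) -> float:
    """Every Apache child carries its own PHP interpreter."""
    return apache_children * (apache_mb + php_mb)


def fastcgi_memory_mb(apache_children: int, apache_mb: float,
                      php_processes: int, php_mb: float) -> float:
    """Apache children stay small; PHP lives in a separately sized pool."""
    return apache_children * apache_mb + php_processes * php_mb


# Hypothetical sizes in MB, for illustration only.
print(mod_php_memory_mb(apache_children=109, apache_mb=5, php_mb=30))
print(fastcgi_memory_mb(apache_children=109, apache_mb=5,
                        php_processes=20, php_mb=30))
```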

Cheers

Tom




