Re: Apache Multi Process Limit

"Vincent Bray" <noodlet@xxxxxxxxx> · Sun, 10 Jun 2007 00:05:39 +0100

On 09/06/07, Angelo Bourik <jusanotherangel@xxxxxxxxx> wrote:

> Hello all,

Hi,

If I understood your question correctly, you've written a spider
script that runs under Apache. It connects to a remote site, fetches
a page by ID and parses out the data you need, once per request to
your own server (or does a single request to your server run all
100,000 fetches?).

My first question is: why are you running your script under Apache
rather than as a command-line process? I don't see any advantage to
having Apache parent these scripts for you.

As for running more than one script at the same time, that may be a
matter of CPU usage, though it does sound odd that your server is not
at least starting the others. Perhaps your script opens and locks
your database in such a way that the others are blocked?

I'd recommend running your spider on the command line, outside of the
server. If you really want to speed it up, split the spider into a
fetching thread group that is network bound, and a parsing thread
group that munges the queued responses. Or, just write your script in
a faster language :-)
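
To make the fetch/parse split concrete, here is a rough sketch in
Python. It is only an illustration under my own assumptions, not
your script: run_spider, fetch and parse are hypothetical names, and
you would supply fetch(page_id) and parse(body) yourself. One group
of threads fetches pages and queues the responses; a second group
takes them off the queue and parses them.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def run_spider(ids, fetch, parse, fetchers=8, parsers=2):
    """Fetch pages in one thread group and parse them in another.

    fetch(page_id) -> body and parse(body) -> data are supplied by the
    caller; both names are placeholders for this sketch.
    """
    responses = queue.Queue(maxsize=100)  # bounded hand-off between groups
    results = {}
    lock = threading.Lock()

    def parse_worker():
        while True:
            item = responses.get()
            if item is None:  # sentinel: no more responses will arrive
                return
            page_id, body = item
            try:
                parsed = parse(body)
            except Exception as exc:
                parsed = exc  # record the failure, keep the worker alive
            with lock:
                results[page_id] = parsed

    def fetch_one(page_id):
        # Network-bound step: blocks on I/O, then queues the response.
        responses.put((page_id, fetch(page_id)))

    workers = [threading.Thread(target=parse_worker) for _ in range(parsers)]
    for w in workers:
        w.start()
    try:
        with ThreadPoolExecutor(max_workers=fetchers) as pool:
            # Consuming the iterator waits for every fetch to finish and
            # re-raises the first fetch error, if there was one.
            list(pool.map(fetch_one, ids))
    finally:
        for _ in workers:
            responses.put(None)  # one sentinel per parsing thread
        for w in workers:
            w.join()
    return results
```

You would call it from a command-line script as something like
run_spider(range(100000), fetch=my_fetch, parse=my_parse), where
my_fetch and my_parse are again placeholders. Bear in mind that in
CPython the parsing threads share one interpreter lock, so most of
the gain here comes from overlapping the network waits, not from
parsing in parallel.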

--
noodl

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.