Hi folks.
I have a project coming up where I will need to process a bazillion (OK,
a few million) records, possibly with multiple steps. (In this case I'm
reading data from one data archive into an Apache Solr server.) This is
a natural use case for a queue server, I believe. While the
requirements of the project do not dictate a language, it makes sense to
me to use PHP for the processing code since 1) other parts of the
project will be using it for web-facing logic and 2) it's the language I
know best.
I'm trying to select a queue server to use. The two I'm investigating
in particular are Beanstalkd (http://kr.github.com/beanstalkd/) and
Gearman (http://gearman.org/). In this case I do need a reliable queue,
even if that means a record gets processed multiple times by accident
(which in this use case is fine).
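
To be concrete about what I mean by "reliable": a job should only leave
the queue once a worker explicitly confirms it, and a job whose worker
dies should become available again. Here is a rough in-memory toy model
of that behavior. It is written in Python purely for brevity; the class
and method names are made up for illustration and are not the Beanstalkd
or Gearman API, and the dict only stands in for the Solr index.

```python
from collections import deque


class ToyReliableQueue:
    """Hypothetical in-memory model of at-least-once delivery."""

    def __init__(self):
        self._ready = deque()   # (job_id, payload) pairs waiting for a worker
        self._reserved = {}     # job_id -> payload, handed out but not confirmed
        self._next_id = 1

    def put(self, payload):
        job_id = self._next_id
        self._next_id += 1
        self._ready.append((job_id, payload))
        return job_id

    def reserve(self):
        """Hand the next job to a worker; it stays tracked until deleted."""
        if not self._ready:
            return None
        job_id, payload = self._ready.popleft()
        self._reserved[job_id] = payload
        return job_id, payload

    def delete(self, job_id):
        """Worker confirms success; only now is the job really gone."""
        del self._reserved[job_id]

    def release(self, job_id):
        """Worker failed or timed out; the job becomes available again."""
        self._ready.append((job_id, self._reserved.pop(job_id)))


def index_record(index, record):
    # Idempotent write: indexing the same record twice leaves the same result.
    index[record["id"]] = record["title"]


queue = ToyReliableQueue()
queue.put({"id": 1, "title": "first"})
queue.put({"id": 2, "title": "second"})

index = {}        # stands in for the Solr index
processed = []    # record ids in the order they were handled

# First worker handles record 1 but "crashes" before confirming it.
job_id, record = queue.reserve()
index_record(index, record)
processed.append(record["id"])
queue.release(job_id)

# A second worker drains the queue, confirming each job after indexing.
while True:
    job = queue.reserve()
    if job is None:
        break
    job_id, record = job
    index_record(index, record)
    processed.append(record["id"])
    queue.delete(job_id)
```

Record 1 gets handled twice in this run, which is harmless because the
write is keyed on the record id. That is the trade-off I'm willing to
accept.
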
Has anyone worked with either of these systems? Any war stories to
share, good or bad? Any guidelines on the server resources we'd need?
For Beanstalk, I've found two user-space PHP libraries, one of which is
apparently dead. The other is:
https://github.com/pda/pheanstalk/
For Gearman, there appear to be both a PECL module and a PEAR module:
http://pear.php.net/package/Net_Gearman/
http://pecl.php.net/package/gearman
(Naturally they do not appear to be mirrors of each other, just to make
life difficult.)
I do have access to install PECL modules on the server(s) in question if
appropriate.
Any experience/advice/horror stories that would help us settle on a
queue and API library?
--Larry Garfield
--
PHP General Mailing List (http://www.php.net/)