Jad madi wrote:
I'm building an RSS aggregator, and I'm trying to find the best way to parse users' account feeds fairly. Say we have 20,000 users with an average of 10 feeds per account, so about 200,000 feeds. How would you schedule the parsing process to keep all accounts up to date without killing the server?

NOTE: some of the 200,000 feeds might be shared between more than one user.

What I was thinking of is to split users into:
1-) Idle users (check their account once a week, no traffic on their RSS feeds)
2-) Idle++ users (check their account once a week, but have traffic on their RSS feeds)
3-) Active users (check their accounts regularly and have traffic on their RSS feeds)

NOTE: the week is just an example; in the end it's going to be a dynamic ratio.

With this classification I can split the parsing power and time as follows:
1-) 10% idle users
2-) 20% idle++ users
3-) 70% active users

NOTE: there are other factors that should be included, but I don't want to muddy the idea for now: CPU usage, memory usage, connectivity issues (if a feed's site is down). In general, the maximum execution time for the continuous parsing loop shouldn't be more than 30 to 60 minutes.

Actually, I'm thinking of writing a daemon to do it: just keep checking CPU/memory and execute whenever a reasonable amount of resources is available, without killing the server.

Please elaborate.
I would suggest using a queue/pool system. Have one mechanism that loads update requests into a queue and another that takes the requests out of the queue at a fixed rate (or several at once over different threads/processes). You then limit the rate at which the requests are consumed to the maximum load you want the server to carry. You can then add update requests to the queue based on how often the user checks their feeds. That way, the more often they check, the more often their feeds are updated.
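To make the idea concrete, here is a minimal in-memory sketch in Python (the thread is about PHP, so treat it as pseudocode for the design rather than something to deploy). The class name `FeedScheduler`, its methods, and the per-feed `interval` in seconds are all hypothetical; the sketch assumes the interval has already been derived from how often the feed's subscribers check their accounts. Feeds are keyed by URL, so a feed shared by several users is only queued once.

```python
import heapq


class FeedScheduler:
    """Min-heap of (due_time, url); a shared feed is stored once per URL."""

    def __init__(self):
        self._heap = []       # entries: (due_time, url)
        self._interval = {}   # url -> refresh interval in seconds

    def subscribe(self, url, interval, now=0.0):
        """Register a feed; a shared feed keeps the shortest interval requested."""
        if url not in self._interval:
            self._interval[url] = interval
            heapq.heappush(self._heap, (now, url))
        else:
            # The shorter interval takes effect the next time the feed is requeued.
            self._interval[url] = min(self._interval[url], interval)

    def take_due(self, now, max_items):
        """Pop at most max_items feeds that are due and requeue each one."""
        due = []
        while self._heap and len(due) < max_items and self._heap[0][0] <= now:
            _, url = heapq.heappop(self._heap)
            due.append(url)
        for url in due:
            heapq.heappush(self._heap, (now + self._interval[url], url))
        return due
```

A worker would call `take_due(time.time(), max_items)` once per tick and fetch only the feeds it returns, so `max_items` per tick is the cap on how much work the server takes on.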
OK, this needs a bit of polishing, but I suspect it would do what you want. A database table would make a nice queue: you insert at the bottom, read and delete off the top, and let the DB engine synchronize the separate threads of action.
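A rough sketch of the table-as-queue idea, using Python's built-in `sqlite3` module purely for illustration (a PHP/MySQL setup would follow the same pattern). The table name `update_queue`, the column `feed_url`, and the helper functions are hypothetical. The read and the delete happen inside one `BEGIN IMMEDIATE` transaction, so two workers should not receive the same row.

```python
import sqlite3


def open_queue(path=":memory:"):
    """Open the database and create the queue table if it is missing."""
    # isolation_level=None: autocommit mode, transactions are managed explicitly.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS update_queue ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " feed_url TEXT NOT NULL)"
    )
    return conn


def enqueue(conn, feed_url):
    """Insert at the bottom of the queue."""
    conn.execute("INSERT INTO update_queue (feed_url) VALUES (?)", (feed_url,))


def dequeue(conn):
    """Read and delete the oldest row in one transaction; None if empty."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT id, feed_url FROM update_queue ORDER BY id LIMIT 1"
        ).fetchone()
        if row is not None:
            conn.execute("DELETE FROM update_queue WHERE id = ?", (row[0],))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return None if row is None else row[1]
```

Each worker process would open its own connection to the same database file and call `dequeue` at whatever rate the server can afford.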
Cheers
AJ
--
www.deployview.com
www.nerds-central.com
www.project-network.com
--
PHP General Mailing List (http://www.php.net/)