Re: How would you do this ?

Jad madi wrote:
I'm building an RSS aggregator, so I'm trying to find out the best way to
parse users' feeds fairly. Let's say we have 20,000 users with an average
of 10 feeds per account, so we have about 200,000 feeds.

How would you schedule the parsing process to keep all accounts updated
without killing the server? NOTE: some of the 200,000 feeds might be
shared by more than one user.

Now, what I was thinking of is to split users into:
1-) Idle users (check their accounts once a week; no traffic on their RSS
feeds)
2-) Idle++ users (check their accounts once a week, but there is traffic
on their RSS feeds)
3-) Active users (check their accounts regularly, and there is traffic on
their RSS feeds)

NOTE: The week is just an example; in the end it's going to be a dynamic
ratio.

So with this classification I can split the parsing power and time into:
1-) 10% for idle users
2-) 20% for idle++ users
3-) 70% for active users.

NOTE: There are other factors that should be included, but I don't want to
make the idea messy now (CPU usage, memory usage, connectivity issues if a
feed's site is down); in general, the maximum execution time for the
continuous parsing loop shouldn't be more than 30 to 60 minutes. Actually,
I'm thinking of writing a daemon to do it: just keep checking CPU/memory
and execute whenever a reasonable amount of resources is available,
without killing the server.
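
Something like this simple loop is what I mean by the daemon (the load
threshold, batch size and helper functions are only placeholders, and
sys_getloadavg() assumes a Unix box):

<?php
// Very rough daemon sketch: only parse when the box isn't busy.
set_time_limit(0); // the daemon is meant to run forever

while (true) {
    $load = sys_getloadavg(); // 1, 5 and 15 minute load averages
    if ($load[0] < 2.0) {     // placeholder threshold
        // keep batches small so one pass stays well under the
        // 30 to 60 minute limit mentioned above
        $batch = next_feeds_to_parse(50); // placeholder query
        foreach ($batch as $feed) {
            parse_feed($feed); // placeholder for the actual fetch + parse
        }
    }
    sleep(60); // back off before checking the load again
}
?>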


Please elaborate.

I would suggest using a queue/pool system. Have one mechanism that loads requests for updates into a queue and another that takes the requests out of the pool at a fixed rate (or several at once over different threads/processes). You then limit the rate at which the messages are consumed to the maximum you want the server to handle. You can then add update requests into the queue based on how often each user checks their feeds; that way, the more often they check, the more often their feeds are updated.
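
Very roughly, the consumer side could look like this (the queue functions are made-up names for whatever mechanism you end up using):

<?php
// Sketch of a rate-limited consumer; dequeue_update_request() and
// update_feed() are placeholders for your own code.
$maxPerMinute = 100; // tune this to what the server can take

while (true) {
    for ($i = 0; $i < $maxPerMinute; $i++) {
        $request = dequeue_update_request();
        if ($request === null) {
            break; // queue is empty, nothing to do right now
        }
        update_feed($request['feed_id']); // fetch + parse one feed
    }
    sleep(60); // never consume more than $maxPerMinute per minute
}
?>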

OK - this needs a bit of polishing, but I suspect it would do what you want. A database table makes a nice queue: you insert at the bottom, read/delete off the top, and let the DB engine synchronize the separate threads of action.
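
For example, assuming an InnoDB table I'll just call update_queue with an auto-increment id and a feed_id column (all names here are made up), the two sides could be as simple as:

<?php
// Sketch of a plain DB table used as the queue.
$db = new PDO('mysql:host=localhost;dbname=aggregator', 'user', 'pass');

// Producer: insert at the bottom of the queue.
function enqueue(PDO $db, $feedId)
{
    $stmt = $db->prepare('INSERT INTO update_queue (feed_id) VALUES (?)');
    $stmt->execute(array($feedId));
}

// Consumer: read and delete off the top; the transaction and row lock
// let the DB engine keep several workers in step.
function dequeue(PDO $db)
{
    $db->beginTransaction();
    $row = $db->query('SELECT id, feed_id FROM update_queue
                       ORDER BY id ASC LIMIT 1 FOR UPDATE')->fetch();
    if ($row) {
        $db->prepare('DELETE FROM update_queue WHERE id = ?')
           ->execute(array($row['id']));
    }
    $db->commit();
    return $row ? $row['feed_id'] : null;
}
?>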

Cheers

AJ

--
www.deployview.com
www.nerds-central.com
www.project-network.com
