On Wed, Apr 25, 2012 at 12:52 PM, Venki Ramachandran <venki_ramachandran@xxxxxxxxx> wrote:
Hi all:Can someone please guide me as to how to solve this problem? If this is the wrong forum, please let me know which one to post this one in. I am new to Postgres (about 3 months into it)I have PostGres 9.0 database in a AWS server (x-large) and a pgplsql program that does some computation. It takes in a date range and for one pair of personnel (two employees in a company) it calculates some values over the time period. It takes about 40ms (milli seconds) to complete and give me the answer. All good so far.Now I have to run the same pgplsql on all possible combinations of employees and with 542 employees that is about say 300,000 unique pairs.So (300000 * 40)/(1000 * 60 * 60) = 3.33 hours and I have to rank them and show it on a screen. No user wants to wait for 3 hours, they can probably wait for 10 minutes (even that is too much for a UI application). How do I solve this scaling problem? Can I have multiple parellel sessions and each session have multiple/processes that do a pair each at 40 ms and then collate the results. Does PostGres or pgplsql have any parallel computing capability.
Setting aside the database concurrency question, have you considered application-level solutions?
How often does a user expect their rank to change? If a daily rank change is fine, trigger the (lengthy) ranking calculation nightly and cache the results in a materialized view for all users; you could continuously rebuild the view to improve freshness to within 4 hours. To go faster with an application-level solution, you will have to reduce your calculation to *what's* likely to be most important to the individual which, again, you can cache; or, if you can predict *who's* most likely to request a ranking, calculate these first; or, both.
These are likely good things to consider regardless of any improvements you make to the back-end ranking calculation, though at you will hit a point of diminishing returns if your ranking calculation drops below some "tolerable" wait. In the web world "tolerable" is about 3 seconds for the general public and about 30 seconds for a captured audience, e.g., employees. YMMV.
Cheers,
Jan