Hi,
I am a new user of PostgreSQL and have some questions about its performance in a scenario with a high request rate.
Let's picture an imaginary scenario: on my system (Debian Linux), there are 200,000,000 records in the database and a total of 10,000 different users. (The manual states the following: there is a main process called postmaster. It starts a new process for each different request and each different user ... I don't understand this very well ... please correct me if I'm wrong.)
You're right. It will start a separate process for each connection.
If all users try to access the database through the web at the same time, what happens: 1. With the OS? Will it crash?
No. At least, no self-respecting POSIX system will crash. None that I know of, anyway.
2. Will the postmaster process start up 10,000 different processes at the same time?
No.
3. What about performance? Is there any performance degradation?
Yes. At that load you'll almost definitely see a massive performance problem.
4. What is the best solution for this problem?
You're presenting an unrealistic scenario, and I'll explain why in a moment.
5. How many simultaneous requests can the postmaster handle without decreasing performance?
Depends on the hardware.
Fact is, the scenario of "10,000 users accessing at the same time" will almost never happen ... especially not through the web. That would be one tremendously popular website.
First off, any web server I've ever seen puts a cap on the number of simultaneous connections. Usually around a few hundred. Let's say your web server has a cap of 200 simultaneous connections (not unusual) and you get 10,000 requests at exactly the same moment (unlikely in any case). Your web server will immediately start servicing 200 of the requests. The remaining 9,800 will be queued to be handled as soon as one of the 200 is complete. Since web requests generally finish fairly quickly, you'll actually see the 10,000 get serviced in short order, although not as quickly as the 9,999th surfer would like, I'm sure.
However, it's likely that your operating system won't be able to queue that big of a backlog, and quite a few of those attempts will return an error that the server is too busy.
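To put a rough number on "short order", here is a minimal Python sketch of the queueing described above. It assumes every request arrives at the same instant, every request takes the same time, and nothing is rejected by the OS backlog limit. The 0.5 second request time is a made-up figure for illustration, and seconds_to_drain is just a name I picked.

```python
import math

def seconds_to_drain(total_requests, max_connections, seconds_per_request):
    """Estimate how long a burst takes to clear when only
    max_connections requests can be serviced at once.

    Simplification: requests are handled in full "waves" of
    max_connections, each wave lasting seconds_per_request.
    """
    waves = math.ceil(total_requests / max_connections)
    return waves * seconds_per_request

# 10,000 simultaneous requests, a cap of 200, 0.5 s per request (assumed)
print(seconds_to_drain(10_000, 200, 0.5))  # 25.0
```

So under those assumptions the last surfer waits about 25 seconds, which is exactly why the cap protects the server at the expense of the people at the back of the queue.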
On the flip side, let's do some numbers. If you're getting 10,000 requests per second, that's 864,000,000 hits per day ... are you actually expecting that amount of traffic? And 10,000 per second doesn't even qualify as "at the same time".
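If you want to check that figure or try other rates, the arithmetic is a one-liner. This is only a sketch in Python; hits_per_day is a name I made up, and it assumes the rate is sustained around the clock.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def hits_per_day(requests_per_second):
    """Convert a sustained per-second request rate into hits per day."""
    return requests_per_second * SECONDS_PER_DAY

print(hits_per_day(10_000))  # 864000000
```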
You're better off evaluating your needs by calculating the maximum load over a fixed period of time, determining how long an average request takes, and using that to figure out the number of processes that will need to be running. For example:
If I figure that between 9:00 and 11:00 am is the busiest it will get, and I'll get approximately 100,000 hits in that time, that's about 14 hits per second (100,000 hits / 7,200 seconds). If each request takes about 3 seconds, I can figure that I'll have about 42 requests active during any one second of that time. Not too bad of a load.
To be safe, I double that to 84, round up, and set Apache's max processes to 100, then set Postgres's maximum connections to the same. Then I spec out hardware that can handle the load and I'm off and running.
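The estimate above is easy to redo with your own numbers. Here is a rough Python sketch of it. The function names are mine, the factor of 2 is the safety margin I used, and rounding up to the next multiple of 50 is just one way to get from 84 to a round 100; treat all of that as assumptions, not a rule.

```python
import math

def peak_concurrency(total_hits, window_seconds, seconds_per_request):
    """Requests in flight at once: arrival rate times time per request.
    The rate is rounded up to a whole number of hits per second."""
    hits_per_second = math.ceil(total_hits / window_seconds)
    return hits_per_second * seconds_per_request

def process_limit(active_requests, safety_factor=2, round_to=50):
    """Pad the estimate by a safety factor, then round up to a
    convenient multiple (both values are assumptions)."""
    padded = active_requests * safety_factor
    return math.ceil(padded / round_to) * round_to

# 100,000 hits between 9:00 and 11:00, about 3 seconds per request
active = peak_concurrency(100_000, 2 * 60 * 60, 3)
limit = process_limit(active)
print(active, limit)  # 42 100
```

The resulting limit is what I'd use for both the web server's process cap and the database's connection cap, so the web server can never ask for more connections than the database will accept.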
-- Bill Moran Potential Technologies http://www.potentialtech.com