On 10/01/2010 06:51 PM, Frank Church wrote:
I want to migrate a lot of single user databases onto a single server to allow the users to access them remotely, instead of on the workstations. The databases are quite small and even the most heavily used ones only have at most a few hundred records added to them everyday. The problem is they have a high polling rate, about every 5 secs. If I migrate them to a central servers how much hardware will be needed for them to cope?
A big question here is how the polling is done. Do the polling clients create new connections for each poll? Or do they keep an idle connection around?
If they disconnect between queries, you may want to use a connection pool like PgPool or similar; see the wiki for options. The pool will help reduce the cost of repeated backend startups by keeping backends around when the clients disconnect.
Either way, you'll want enough RAM to keep the "hot" data (or at least the indexes for that data) in memory, with a comfortable margin. If that's impossible to achieve, look at using partial indexes or other things to permit you to maintain small indexes that will fit in memory, because you're going to need it.
Lots of (preferably fast) disks are a big bonus if your data set sees frequent writes or is too big to fit in memory. Disk storage is less critical if your whole dataset fits comfortably in memory and doesn't have heavy write activity.
As for CPU: it depends on if you can fit the whole dataset into memory, and how complex your queries are. If you're doing fairly simple queries without too much CPU-intensive calculation, or your dataset doesn't fit in RAM so the CPU will be waiting on the disks, then there's not much point to a fast CPU. OTOH, if your data set fits in RAM, the CPU and the speed of the RAM will become the limiting factors for query execution speed.
My initial approach was to create views for each user and use updateable views and/or triggers so that the applications can work transparently simply by changing the names of the tables to the names of the views, but I am wondering whether it will be much simpler to use individual databases instead.
There are security trade-offs there, as well as efficiency ones. If you use one database you have much more ability to pool connections, which is great when dealing with polling. Unfortunately, Pg backends can't switch users, so pooling won't do you anywhere near as much good if you're using database-level user access control.
You might want to consider isolating the tables into different schema of the same database. You can use search_path to make this largely transparent, especially with per-user search paths if you're using db user level security.
-- Craig Ringer -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general