Re: How many databases can PostgreSQL open at a time?

Craig Ringer <craig@xxxxxxxxxxxxxxxxxxxxx> · Fri, 01 Oct 2010 19:47:06 +0800

On 10/01/2010 06:51 PM, Frank Church wrote:
I want to migrate a lot of single user databases onto a single server
to allow the users to access them remotely, instead of on the
workstations. The databases are quite small and even the most heavily
used ones only have at most a few hundred records added to them
everyday.

The problem is they have a high polling rate, about every 5 secs. If I
migrate them to a central servers how much hardware will be needed for
them to cope?

A big question here is how the polling is done. Do the polling clients 
create new connections for each poll? Or do they keep an idle connection 
around?

If they disconnect between queries, you may want to use a connection 
pool like PgPool or similar; see the wiki for options. The pool will 
help reduce the cost of repeated backend startups by keeping backends 
around when the clients disconnect.

Either way, you'll want enough RAM to keep the "hot" data (or at least 
the indexes for that data) in memory, with a comfortable margin. If 
that's impossible to achieve, look at using partial indexes or other 
things to permit you to maintain small indexes that will fit in memory, 
because you're going to need it.

Lots of (preferably fast) disks are a big bonus if your data set sees 
frequent writes or is too big to fit in memory. Disk storage is less 
critical if your whole dataset fits comfortably in memory and doesn't 
have heavy write activity.

As for CPU: it depends on if you can fit the whole dataset into memory, 
and how complex your queries are. If you're doing fairly simple queries 
without too much CPU-intensive calculation, or your dataset doesn't fit 
in RAM so the CPU will be waiting on the disks, then there's not much 
point to a fast CPU. OTOH, if your data set fits in RAM, the CPU and the 
speed of the RAM will become the limiting factors for query execution speed.

My initial approach was to create views for each user and use
updateable views and/or triggers so that the applications can work
transparently simply by changing the names of the tables to the names
of the views, but I am wondering whether it will be much simpler to
use individual databases instead.

There are security trade-offs there, as well as efficiency ones. If you 
use one database you have much more ability to pool connections, which 
is great when dealing with polling. Unfortunately, Pg backends can't 
switch users, so pooling won't do you anywhere near as much good if 
you're using database-level user access control.

You might want to consider isolating the tables into different schema of 
the same database. You can use search_path to make this largely 
transparent, especially with per-user search paths if you're using db 
user level security.

--
Craig Ringer

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general