Kevin Grittner wrote:
A schema is a logical separation within a database. Table client1.account is a different table from client2.account. While a user can be limited to tables within a single schema, a user with rights to all the tables can join between them as needed. You could put common reference data in a public schema which all users could access in addition to their private schemas
My guess would be that this app will end up being best split by schema. I wonder whether it *also* needs to be split by database, too. 2000 clusters is clearly a nightmare, and putting all the client data into one big table has both performance and security issues; that leaves database and schema as possible splits. However, having 2000 databases in a cluster is probably too many; having 2000 schemas in a database might also be too many. There are downsides to expanding either of those to such a high quantity.
In order to keep both those in the domain where they perform well and are managable, it may be that what's needed is, say, 50 databases with 40 schemas each, rather than 2000 of either. Hard to say the ideal ratio. However, I think that at the application design level, it would be wise to consider each client as having a database+schema pair unique to them, and with the assumption some shared data may need to be replicated to all the databases in the cluster. Then it's possible to shift the trade-off around as needed once the app is built. Building that level of flexibility in shouldn't be too hard if it's in the design from day one, but it would be painful bit of refactoring to do later. Once there's a prototype, then some benchmark work running that app could be done to figure out the correct ratio between the two. It might even make sense to consider full scalability from day one and make the unique client connection info host:port:database:schema.
P.S. Very refreshing to get asked about this before rather than after a giant app that doesn't perform well is deployed.
-- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support greg@xxxxxxxxxxxxxxx www.2ndQuadrant.us -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance