Geoffrey wrote:
> We are going live with an application in a few months that is a
> complete rewrite of an existing application.  We are moving from an
> existing proprietary database to PostgreSQL.  We are looking for some
> insight/suggestions as to how folks test PostgreSQL in such a
> situation.

Shouldn't you run your tests *before* rewriting your application? :)
You don't have to answer that.

> We're also trying to decide whether a single database with multiple
> schemas or multiple databases are the best solution.  We've done some
> research on this through the archives, and the answer seems to depend
> on the database/application design.  Still, we welcome any generic
> ideas on this issue as well.

I can help a little bit here.  Yes, this decision will be heavily
influenced by your application design.  Let's assume you have to keep
multiple identical table sets (suppose, for example, you host multiple
companies on the same server).  Here are some general observations:

Reasons to use schemas:

* If you have a requirement to query multiple data stores at the same
  time, or to query between the various data stores and a shared area,
  that argues for schemas.  It is possible to do this without schemas
  via dblink, PostgreSQL's inter-database RPC mechanism (first sketch
  at the end of this message), but performance can be an issue and
  there is some setup overhead.

* If you need to swap data stores on the fly without reconnecting,
  that argues strongly for schemas: you can switch data stores simply
  by manipulating the search_path (second sketch below).  There is one
  big caveat: your non-dynamic pl/pgsql functions will stick to the
  tables they touched the first time you ran them, like suction cups.
  Worse, your sql functions will stick to the tables they referred to
  when compiled, making non-immutable sql functions a no-no in a
  multi-schema environment.  There is a clever workaround, though:
  force-recompile your pl/pgsql functions (search the recent archives
  of this list; the third sketch below shows the underlying problem
  and one alternative).

* Finally, since multiple schemas can share a common public area, any
  database features that apply to all of your data stores can
  sometimes live in the public part of the database.  Server-side
  utility functions are an example of this (fourth sketch below).

Reasons to use databases:

* Certain third-party tools may have trouble with schemas.

* Manipulating the search_path can be error-prone and relatively
  tedious.

* Databases are more fully separated.  I run multi-schema, and I make
  heavy use of the userlock contrib module, which means I have to take
  special care to avoid inter-schema overlap of my lock identifiers
  (fifth sketch below).  There are other cases where this might bite
  you, for example if you wanted one data store to respond to
  notifications but not another.  These are solvable problems, but
  they can be a headache.

In short, there are pros and cons either way.  If it's any help, the
servers I administer, which have *really complex* data interdependency
and isolation requirements, use schemas for the extra flexibility.
I've appended a few rough sketches of the points above.

Merlin
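
Sketch 1: querying another database through dblink.  A minimal
example, assuming contrib/dblink is installed; the database name
"otherdb" and the "customers" table are made up for illustration:

-- dblink takes a connection string and a query and returns a record
-- set, so the caller has to supply a column definition list:
SELECT c.customer_id, c.name
FROM dblink('dbname=otherdb',
            'SELECT customer_id, name FROM customers')
     AS c(customer_id integer, name text);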
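
Sketch 2: swapping data stores via search_path.  The schema names
(company_a, company_b) and the "orders" table are hypothetical:

CREATE SCHEMA company_a;
CREATE SCHEMA company_b;
CREATE TABLE company_a.orders (order_id integer, total numeric);
CREATE TABLE company_b.orders (order_id integer, total numeric);

-- an unqualified "orders" resolves through the search_path, so the
-- same query text hits a different table after each SET:
SET search_path TO company_a, public;
SELECT count(*) FROM orders;            -- counts company_a.orders

SET search_path TO company_b, public;
SELECT count(*) FROM orders;            -- counts company_b.orders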
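
Sketch 3: the plan-caching "suction cup" problem.  This is not the
recompile trick from the archives, just an illustration of the
pitfall plus the dynamic-SQL alternative; the function names and the
"orders" table are made up:

-- static SQL: the first call saves a plan with "orders" already
-- resolved to a specific schema, so (at least on the versions I've
-- seen this on) later search_path changes are ignored:
CREATE OR REPLACE FUNCTION order_count() RETURNS bigint AS $$
BEGIN
    RETURN (SELECT count(*) FROM orders);
END;
$$ LANGUAGE plpgsql;

-- dynamic SQL via EXECUTE is planned on every call, so it follows
-- whatever the search_path is at call time:
CREATE OR REPLACE FUNCTION order_count_dynamic() RETURNS bigint AS $$
DECLARE
    n bigint;
BEGIN
    EXECUTE 'SELECT count(*) FROM orders' INTO n;
    RETURN n;
END;
$$ LANGUAGE plpgsql;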
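
Sketch 4: a shared utility function in the public area.  The function
itself is hypothetical; the point is that every data store can call
it unqualified as long as public stays on the search_path:

CREATE OR REPLACE FUNCTION public.normalize_phone(raw text)
RETURNS text AS $$
    -- strip everything that isn't a digit
    SELECT regexp_replace($1, '[^0-9]', '', 'g');
$$ LANGUAGE sql IMMUTABLE;

SET search_path TO company_a, public;
SELECT normalize_phone('(555) 123-4567');  -- finds public.normalize_phone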
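
Sketch 5: namespacing lock identifiers per data store.  I'm sketching
the idea with the built-in two-key advisory locks as a stand-in for
the userlock contrib calls; the point is that prefixing each lock
with a per-schema id keeps identical ids in different data stores
from colliding:

SELECT pg_advisory_lock(1, 42);     -- data store #1, resource 42
SELECT pg_advisory_lock(2, 42);     -- data store #2, same resource
                                    -- id, but no collision
SELECT pg_advisory_unlock(2, 42);
SELECT pg_advisory_unlock(1, 42);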