On Fri, Oct 9, 2009 at 10:50 PM, Scott Carey <scott@xxxxxxxxxxxxxxxxx> wrote: > On 10/9/09 2:02 PM, "Merlin Moncure" <mmoncure@xxxxxxxxx> wrote: > >> On Fri, Oct 9, 2009 at 1:46 PM, Scott Otis <scott.otis@xxxxxxxxxx> wrote: >>> Over the next couple of months we will be creating an instance of our >>> solution for each public school district in the US which is around 18,000. >>> That means currently we would be creating 18,000 databases (all on one server >>> right now which is running 8.4). I am assuming this is probably not the >>> best way of doing things. >> >> Schema advantages: >> *) maintenance advantages; all functions/trigger functions can be >> shared. HUGE help if you use them >> *) can query shared data between schemas without major headaches >> *) a bit more efficiency especially if private data areas are small. >> kinda analogous to processes vs threads >> *) Can manage the complete system without changing database sessions. >> This is the kicker IMO. >> >> Database Advantages: >> *) More discrete. Easier to distinctly create, dump, drop, or move to >> separate server >> *) Smaller system catalogs might give efficiency benefits >> > > I'm concerned how a system with 57 * 18000 > 1M tables will function. > > I've got 200,000 tables in one db (8.4), and some tools barely work. The > system catalogs get inefficient when large and psql especially has trouble. > Tab completion takes forever, even if I make a schema "s" with one table in > it and type "s." and try and tab complete -- its as if its scanning all > without a schema qualifier or using an index. Sometimes it does not match > valid tables at all, and sometimes regex matching fails too ('\dt > schema.*_*_*' intermittently flakes out if it returns a lot of matches). > Other than that the number of tables doesn't seem to cause much performance > trouble. The only exception is constraint exclusion which is fundamentally > broken with too many tables on the performance and memory consumption side. > > Having a lot of tables really makes me wish VACUUM, ANALYZE, and other > maintenance tools could partially matched object names with regex though. > > On the other hand, lots of databases probably has performance drawbacks too. > And its maintenance drawbacks are even bigger. > > I certainly don't see any reason at all to try and put all of these in one > schema. The only useful choices are schemas vs databases. I'd go for > schemas unless the performance issues there are a problem. Schemas can be > dumped/restored/backed up independent of one another easily too. They can, but: drop schema foo cascade; is a different operation than: drop database foo; The first is kinda surgical and the second is a rocket launcher. What would you rather have in battle? For the record, just about every database I've ever designed has had some of what I call 'de facto table partitioning' using schemas/search_path tricks. I'm working on a system right now that is going to get very large and if I started to run into psql problems I'd probably look at patching it, maybe \set an option to simplify some of the queries. merlin -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance