Search Postgresql Archives

Re: Partitioning Vs. Split Databases - performance?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thursday 21 December 2006 11:47, Ron Johnson wrote:
> This gives you linear growth potential, since if your current box
> gets over-utilized, buy a 2nd box and move some of the databases to it.

So far, I'm inclined to go this way, due to the option for linear scaling. 

> >> 2) Copy out the data specific to a customer and load into separate tables
> >> (with slightly different names, EG table "dates" becomes "cust1_dates")
> >> and 
> >> use data partitioning to help with performance as needed.
> 
> Definitely *not* scalable.  And *very* messy.  Yech.

Scales better than present, methinks, but still not the best idea. I'd have to 
revisit all my queries to make sure that they use the correct tablename. 

> > 3) Put each customer in their own schema/namespace which resides within
> > its own table space.
> >
> > Then you can move customers wherever you need in terms of IO.

How is that functionally different than using a separate database? What's the 
advantage here? I don't *need* to restrict myself to one database, and doing 
this does require that I revisit 100% of the SQL queries to make sure that 
I'm referencing the right schema. 

This solution seems to have the same problems as using dynamic tablenames. 

> Splitting like mentioned in these three tactics means that you've
> now got 2x as many tables.  Add more customers and you've got that
> many more tables.  Perfect candidate for "schema drift".
> 
> If each table has cust_id in it, then you could:
> 
> 4) retain 1 database and partition each table on cust_id.
> 
> http://www.postgresql.org/docs/8.1/interactive/ddl-partitioning.html

Intriguing idea, and one that I might do in the future. However, I have 170 
normalized tables defined at present. The task of auditing each of these 
tables for the constraint ranges sounds somewhat nightmarish. Not all the 
tables have the cust_id record defined - some are implicit. 

> >> Given the same physical hardware, which one is likely to perform better? 
Does
> >> it make any difference? Does using separate databases use more RAM than a
> >> single database with a bunch of different tables?
> 
> Config files are global, so I doubt it.
> 
> >> Company is growing rapidly, so growth room is important...
> 
> Then go for Option 1.

My conclusion, too. Another poster mentioned schema drift, and that's a real 
concern, but we're already updating the schema through a script which could 
be extended to update all databases, not just one, so I feel this problem 
would be minor to non-existent. 

Thanks! 

-Ben 
-- 
"The best way to predict the future is to invent it."
- XEROX PARC slogan, circa 1978


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux