Hi Jan, Thank you so much for your input and sorry about the delay. I understand most of the limitations, compromises, and sacrifices that come with distributed databases (sharding). But definitely not all of them. After working with databases for about 25 years, I have some basic knowledge about how they work (mssql, oracle, sybase). But I am new to postgres and new to distributed db approach, so any feedback in those areas is extremely welcome :-) If anybody is aware of any successful implementations of distributed approach with postgres and any information around this area, please let me know. If I cannot find a suitable prototype and has to “invent the wheel”, I hope to be able to contribute back to the community with my story one day. Thank you once again, -igor From: Jan Nielsen [mailto:jan.sture.nielsen@xxxxxxxxx] Hi Igor, On Tue, May 29, 2012 at 2:13 PM, Igor Shmain <igor.shmain@xxxxxxxxx> wrote: I need to design a solution for a database which will grow and will require a horizontal split at some moment. Ok
Since PG does not support the notion of a shard at the database layer, you will have to create this yourself. The key to scaling shards is ensuring there there are no cross-shard joins (or any other shared data); this is an application-design question so unless you "own" the application-design, this will be difficult to achieve. If you do own the design, the application's entity-tree will have to be carefully constructed to avoid any sharing of resources across shards. RDBMS programmers are taught just the opposite so this will require diligence from your team with a very clear, likely simple, application-entity design amenable to shard-ing.
As I mentioned above, it is unlikely that you will be able to shard an RDBMS at this level; you need to look at the application entity-design.
If you mean to dictate a shard solution, very few will be able to help. On the other hand, if you mean to ask for help in finding a "low-cost scale-able PG DB solution for a typical RDBMS application with the potential for rapid growth", then there are a number of options which have been employed with success. For predominately read applications, use a single master database replication strategy: For predominately write applications, you will likely need a multi-master solution (unless you can dictate the application entity-design) which is more complex territory from a scale-ability and application design standpoint but you might look at Postgres-XS and Londiste for ideas. Cheers, Jan |