On Thu, 2008-08-21 at 22:17 +0800, Amber wrote:
> Another question: how many people are there maintaining this huge database?
> We have about 2 TB of compressed SAS datasets and are now considering loading them into an RDBMS.
> According to your experience, it seems a single PostgreSQL instance can't manage databases of that size well; is that right?

Yahoo has a 2 PB single-instance Postgres database (with a modified engine), but the biggest pure-Pg single instance I've heard of is 4 TB. The 4 TB database has the additional interesting property that its owners have made none of the standard "scalable" architecture changes (partitioning, etc.). To me, this is really a shining example that even naive Postgres databases can scale to as much hardware as you're willing to throw at them. Of course, clever solutions will get you much more bang for your hardware buck.

As for my personal experience, I'd say that the only reason we're currently running a dual Pg instance (master plus replica/hot standby) configuration is report times. It's really important to us to have snappy access to our data warehouse. During maintenance, our site and processes can easily be powered by the master database alone, with some noticeable performance degradation for the users.

The "grid" that we (I) are looking to build comes out of changing (yet ever static!) business needs: we're looking at an immediate 2x increase in data volume and will soon need to scale to 10x. Couple this with increased user load and the desire to make reports run even faster than they currently do, and we're really going to run up against a hardware boundary. Besides, writing grid/distributed databases is *fun*!

Uh, for a one-sentence answer: a single Pg instance can absolutely handle 2+ TB without flinching.

> How many CPU cores and how much memory does your server have? :)

My boss asked me not to answer the questions I missed... sorry. I will say that the hardware is pretty modest, but has good RAM and disk space.

-Mark