Re: Postgres for a "data warehouse", 5-10 TB


On Sun, Sep 11, 2011 at 7:52 AM, Scott Marlowe <scott.marlowe@xxxxxxxxx> wrote:
> On Sun, Sep 11, 2011 at 6:35 AM, Igor Chudov <ichudov@xxxxxxxxx> wrote:
>> I have a server with about 18 TB of storage and 48 GB of RAM, and 12
>> CPU cores.
>
> 1 or 2 fast cores is plenty for what you're doing.

I need those cores for other tasks, such as image manipulation with ImageMagick, generating and parsing XML, etc.
 
> But the drive array and how it's configured etc are very important.
> There's a huge difference between 10 2TB 7200RPM SATA drives in a
> software RAID-5 and 36 500G 15kRPM SAS drives in a RAID-10 (SW or HW
> would both be ok for data warehouse.)

Well, right now, my server has twelve 7,200 RPM 2TB hard drives in a RAID-6 configuration.

They are managed by a 3ware 9750 RAID card.
 
I would say that I am not very concerned about read speed scaling linearly with disk speed. If that part is somewhat slow, it is OK with me.

What I want to avoid is severe performance degradation as the data grows (time complexity greater than O(1)), disastrous REPAIR TABLE operations, etc.


>> I do not know much about Postgres, but I am very eager to learn and
>> see if I can use it for my purposes more effectively than MySQL.
>> I cannot shell out $47,000 per CPU for Oracle for this project.
>> To be more specific, the batch queries that I would do, I hope,
>
> Hopefully, if need be, you can spend some small percentage of that on
> a fast IO subsystem.



I am actually open to suggestions here.
 
>> would either use small JOINS of a small dataset to a large dataset, or
>> just SELECTS from one big table.
>> So... Can Postgres support a 5-10 TB database with the use pattern
>> stated above?
>
> I use it on a ~3TB DB and it works well enough.  Fast IO is the key
> here.  Lots of drives in RAID-10 or HW RAID-6 if you don't do a lot of
> random writing.

I do not plan to do a lot of random writing. My current design is that my Perl scripts write to a temporary table every week, and then I do INSERT ... ON DUPLICATE KEY UPDATE.
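
For concreteness, here is a minimal sketch of that weekly staging-then-merge load. It is an illustration only, under stated assumptions: it uses Python's built-in sqlite3 module so that it runs standalone, the table and column names (items, items_staging, id, price) are hypothetical, and the merge is written as a plain UPDATE followed by INSERT ... WHERE NOT EXISTS rather than MySQL's INSERT ... ON DUPLICATE KEY UPDATE, because that two-statement form is ordinary SQL and does not depend on any one server's upsert syntax.

```python
import sqlite3


def merge_staging(conn):
    """Merge items_staging into items: update existing keys, insert new ones.

    Table and column names are hypothetical, chosen only for this sketch.
    """
    with conn:  # commit on success, roll back if any statement fails
        # Step 1: overwrite rows whose key already exists in the main table.
        conn.execute("""
            UPDATE items
               SET price = (SELECT s.price FROM items_staging AS s
                             WHERE s.id = items.id)
             WHERE EXISTS (SELECT 1 FROM items_staging AS s
                            WHERE s.id = items.id)
        """)
        # Step 2: insert rows whose key is not yet in the main table.
        conn.execute("""
            INSERT INTO items (id, price)
            SELECT s.id, s.price
              FROM items_staging AS s
             WHERE NOT EXISTS (SELECT 1 FROM items AS i WHERE i.id = s.id)
        """)
        # Step 3: empty the staging table for next week's batch.
        conn.execute("DELETE FROM items_staging")


def demo():
    """Load two existing rows, stage one update and one new row, then merge."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, price REAL)")
    conn.execute("CREATE TABLE items_staging (id INTEGER PRIMARY KEY, price REAL)")
    conn.executemany("INSERT INTO items VALUES (?, ?)", [(1, 10.0), (2, 20.0)])
    conn.executemany("INSERT INTO items_staging VALUES (?, ?)", [(2, 25.0), (3, 30.0)])
    merge_staging(conn)
    return conn.execute("SELECT id, price FROM items ORDER BY id").fetchall()
```

In demo(), the main table starts with keys 1 and 2 and the staging table holds keys 2 and 3; after the merge, key 2 carries the staged price and key 3 is new. Both steps run in one transaction, so a failed batch leaves the main table untouched.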

By the way, does that INSERT ... UPDATE functionality, or something like it, exist in Postgres?

i
