I've been reading this list for a couple of weeks, so I've got some sense of what you folks are likely to recommend, but I'm curious what would be considered an ideal storage solution if building a database system from scratch. I just got an exploratory call from my boss asking what my preference would be, and I didn't really have a great answer ready. Budget is certainly NOT unlimited, but with the right justification I don't need to pinch pennies, either.

The workload: it is a combination of OLTP and data warehouse, but the OLTP portion is trivially light. All of the load comes from the constant data insertion and the reporting queries over that data. Inserts are all performed via COPY. The dataset is kept small at the moment via very aggressive aggregation and then dropping the older, more granular data, but I'd like to be able to expand the quantity of data that I keep at each level of aggregation.

Inserts are currently occurring at a rate of about 85,000 rows per minute, executed via 3 COPY statements of about 50,000, 30,000, and 5,000 rows each into 3 different tables. The COPY statements execute in a small fraction of the minute in which they occur; I don't have timings handy, but no more than a couple of seconds.

All fact tables are partitioned over time. Data comes into the db already aggregated by minute. I keep 2 hours of minute-scale data in a table partitioned by hour. Once per hour, the previous hour of data is aggregated up into an hour-scale table. I keep 2 days of hour-scale data in a table partitioned by day. Once per day, that gets aggregated up into a day-scale table that is partitioned by month. We keep 2 months of day-scale data. Once per month, that gets aggregated up into a month-scale table, and that data is kept indefinitely at the moment, though it may eventually be limited to 3 years or so. All old data gets removed by dropping the older partitions. There are no updates at all.
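To make the cascade concrete, the hourly rollup step looks roughly like this. The table and column names here (minute_data, hour_data, ts, metric_id, value) are simplified placeholders, not our real schema:

```sql
-- Hypothetical hourly rollup job: aggregate the previous hour of
-- minute-scale rows into the hour-scale table, then drop the oldest
-- minute-scale partition to enforce the 2-hour retention window.
BEGIN;

INSERT INTO hour_data (ts, metric_id, value)
SELECT date_trunc('hour', ts) AS ts,
       metric_id,
       sum(value) AS value
FROM   minute_data
WHERE  ts >= date_trunc('hour', now()) - interval '1 hour'
AND    ts <  date_trunc('hour', now())
GROUP  BY 1, 2;

-- Dropping a whole child partition is far cheaper than DELETE + VACUUM.
DROP TABLE minute_data_h03;  -- oldest hourly partition; name is illustrative

COMMIT;
```

The day-scale and month-scale steps follow the same pattern with wider windows.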
Most reporting is done from the 2 hours of minute-scale data and the 2 months of day-scale data, which are 4 million and 47 million rows, respectively. I'm not sure the partitioning gets us much beyond making removal of old data much more efficient: queries are usually over the most recent 60 minutes or the most recent 30 days, so they tend to touch both partitions to some degree in every query, except during the last minute and last day of each time period.

We haven't put a lot of effort into tuning the queries, since the dataset used to be MUCH smaller, so there is likely some work to be done just in tuning the system as it stands; but queries are definitely taking longer than we'd like them to, and we expect the volume of data coming into the db to grow in the coming months. Ideally, I'd also like to be keeping a much longer series of minute-scale data, since that is the data most useful for diagnosing problems in the run-time system that generates it, though we may still limit queries on that data set to the last 2 hours.

I inherited the hardware and have changed absolutely nothing to date.
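For what it's worth, part of why the partitioning may not be buying much for queries: with inheritance-based partitioning, constraint exclusion can only prune child tables when the WHERE bounds are plan-time constants, so a now()-relative window ends up touching every partition regardless. A sketch with the same placeholder names as above (minute_data, ts, metric_id, value - not our real schema), assuming each child partition carries a CHECK constraint on its ts range:

```sql
-- Needs constraint exclusion enabled (the setting is 'partition' by
-- default in 8.4+, 'off' in some older releases).
SET constraint_exclusion = on;

-- Literal bounds: the planner can prove which child partitions to skip,
-- and EXPLAIN will show only the matching partitions being scanned.
EXPLAIN
SELECT metric_id, sum(value)
FROM   minute_data
WHERE  ts >= '2010-05-01 03:00' AND ts < '2010-05-01 04:00'
GROUP  BY metric_id;

-- now()-relative bounds are not plan-time constants, so the plan
-- includes a scan of every child partition.
EXPLAIN
SELECT metric_id, sum(value)
FROM   minute_data
WHERE  ts >= now() - interval '60 minutes'
GROUP  BY metric_id;
```

Comparing the two EXPLAIN outputs is a quick way to see whether the partitions are being pruned at all.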
Current hardware - looks like I've got 4 of these on the host:

# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Xeon(R) CPU 5110 @ 1.60GHz
stepping        : 6
cpu MHz         : 1600.002
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx tm2 ssse3 cx16 xtpr pdcm dca lahf_lm tpr_shadow
bogomips        : 3192.31
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual

So that's 8 cores total. There is 8 GB of RAM, but the box is capable of handling 128 GB, and I get no resistance when I suggest going to 64 GB.

There are 6 internal drives on battery-backed RAID (I don't know what RAID level - is there a way to discover this?), all in a single filesystem, so WAL and data are on the same filesystem. I don't believe that we are taking advantage of the battery-backed controller, since I only see this in /etc/fstab:

UUID=12dcd71d-8aec-4253-815c-b4883195eeb8 / ext3 defaults 1 1

But inserts are happening so rapidly that I don't imagine getting rid of fsync is going to change the performance of the reporting queries too dramatically. Total available storage is 438 GB; the database currently occupies 91 GB on disk.

So my question is: what would be the recommended storage solution, given what I've said above, and at what kind of price points? I have no idea at what price point I'll start to get resistance at the moment. It could be $10K, it could be 5 times that. I really hope it won't be considerably less than that.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance