However, if you can only get a few disks, it is a lot harder to choose between one large array and two of them split without experimenting with both on real data and queries. It is a quick and easy performance win if you have 6+ disks and do enough writes.
Also, if you intend to have lots of data organized by a time field, and expect to do the reporting/aggregation queries on subsets of that data bounded by time, partitioning by time can have huge benefits. Partition by month, for example, and sequential scans will only flow to the months of interest if the queries have the right lmits on the date in the where clause.
Partitioning WILL take more development and tuning time, so don't do it unless you know you need it... though if the reporting is mostly restricted to time windows, the impact it has on improving runtimes of aggregation queries is immense. However, partitioning won't help at all until you have enough data to justify it.
On Thu, Oct 23, 2008 at 10:16 AM, Scott Marlowe <scott.marlowe@xxxxxxxxx> wrote:
On Thu, Oct 23, 2008 at 10:38 AM, Scott Marlowe <scott.marlowe@xxxxxxxxx> wrote:Note that for transactional databases SAS drives are usually
>>
>> Any other better option that I could ask for ?
>
> Yes, more drives. 4 drives in a RAID10 is a good start. If you could
> get 8 or 12 in one that's even better.
>
noticeably better, but for reporting databases, SATA drives are
generally fine, with 70-80% the sustained transfer rate at less than
half the cost per megabyte. I'd recommend 8 SATA drives over 4 SAS
drives for a reporting database. You'll spend about the same on twice
the number of drives but you'll get much more storage, which is often
useful when you need to work with large datasets.
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance