Re: Hardware HD choice...

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



If you are doing batch inserts of data, and want to have reporting queries concurrently running, make sure you have the pg_xlogs on a different disk than the data/indexes.   2 drives RAID 1 for OS + xlogs works great (and these can be SAS if you choose, have a separate partition -- ext2 if it is linux -- for the xlogs.  Then you can easily go with storage capacity and SATA for the main reporting portion.  You just don't want the inserts in batches to slow the whole thing to a crawl due to xlog writes on the same drive array as the reporting.
However, if you can only get a few disks, it is a lot harder to choose between one large array and two of them split without experimenting with both on real data and queries.  It is a quick and easy performance win if you have 6+ disks and do enough writes.

Also, if you intend to have lots of data organized by a time field, and expect to do the reporting/aggregation queries on subsets of that data bounded by time, partitioning by time can have huge benefits.  Partition by month, for example, and sequential scans will only flow to the months of interest if the queries have the right lmits on the date in the where clause.

Partitioning WILL take more development and tuning time, so don't do it unless you know you need it... though if the reporting is mostly restricted to time windows, the impact it has on improving runtimes of aggregation queries is immense.  However, partitioning won't help at all until you have enough data to justify it.

On Thu, Oct 23, 2008 at 10:16 AM, Scott Marlowe <scott.marlowe@xxxxxxxxx> wrote:
On Thu, Oct 23, 2008 at 10:38 AM, Scott Marlowe <scott.marlowe@xxxxxxxxx> wrote:
>>
>> Any other better option that I could ask for ?
>
> Yes, more drives.  4 drives in a RAID10 is a good start.  If you could
> get 8 or 12 in one that's even better.
>

Note that for transactional databases SAS drives are usually
noticeably better, but for reporting databases, SATA drives are
generally fine, with 70-80% the sustained transfer rate at less than
half the cost per megabyte.  I'd recommend 8 SATA drives over 4 SAS
drives for a reporting database.  You'll spend about the same on twice
the number of drives but you'll get much more storage, which is often
useful when you need to work with large datasets.

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux